“Diffusion  Geometry  Based  Nonlinear  Methods 
for  Hyperspectral  Change  Detection” 

Contract  Item  0001 AC  -  Final  Technical  Report 
Plain  Sight  Systems,  Inc. 


Air  Force  Office  of  Scientific  Research 
Contract  #  FA9550-09-C-0189 


Dr.  Andreas  Coppi  (POC) 
Dr.  Frederick  Warner 
Plain  Sight  Systems,  Inc. 
19  Whitney  Avenue 
New  Haven,  CT  06510 
(203)  285-8617 
coppi@plainsight.com 


Dr. Ronald R. Coifman (POC)
Dr. Matthew Hirn
Program of Applied Mathematics
Yale University
PO Box 208283
New Haven, CT 06520-8283
(203) 432-1213
coifman@math.yale.edu


May  12,  2010 


REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB  No.  0704-0188 




The  public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources, 
gathering  and  maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of 
information,  including  suggestions  for  reducing  the  burden,  to  the  Department  of  Defense,  Executive  Services  and  Communications  Directorate  (0704-0188).  Respondents  should  be  aware 
that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  any  penalty  for  failing  to  comply  with  a  collection  of  information  if  it  does  not  display  a  currently  valid  OMB 
control  number. 

PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ORGANIZATION.

1. REPORT DATE (DD-MM-YYYY): 12-05-2010

2. REPORT TYPE: Final

3. DATES COVERED (From - To): 20090715 - 20100414

4.  TITLE  AND  SUBTITLE 

"Diffusion  Geometry  Based  Nonlinear  Methods  for  Hyperspectral  Change 
Detection" 

5a. CONTRACT NUMBER: FA9550-09-C-0189

5b. GRANT NUMBER

5c. PROGRAM ELEMENT NUMBER

5d. PROJECT NUMBER

5e. TASK NUMBER: 0001AC

5f. WORK UNIT NUMBER

6. AUTHOR(S)

Dr. Ronald Coifman
Dr. Andreas Coppi
Dr. Matthew Hirn
Dr. Frederick Warner

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Plain  Sight  Systems,  Inc.  -  19  Whitney  Avenue,  New  Haven,  CT  06510 

STTR  University  Partner:  Program  of  Applied  Mathematics,  Yale  University,  P.O.  Box 
208283,  New  Haven,  CT  06520-8283 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

AF  OFFICE  OF  SCIENTIFIC  RESEARCH 

875  NORTH  RANDOLPH  STREET  ROOM  3112 

ARLINGTON  VA  22203 

10.  SPONSOR/MONITOR'S  ACRONYM(S) 

AFOSR 

11.  SPONSOR/MONITOR'S  REPORT 

NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

UNLIMITED 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

Throughout this Phase I project, we have integrated a suite of nonlinear signal processing algorithms derived from diffusion
geometry into an existing proprietary hyperspectral processing toolbox. These methods enable the organization and comparison of
spatio-spectral features of hyperspectral images acquired under different conditions, for target detection and for change and
anomaly assessment. The main ingredient in our approach is a high-level “geometrization” of spatio-spectral signatures.

We developed an approach that simultaneously segments a scene in terms of similarities of spatio-spectral signatures at different
levels of inference and partitions the feature space of spectra and morphology into groups of features related to the various
locations in the scene. We refer to this approach, in which we interrogate and organize both the pixels and their responses, as the
questionnaire organization paradigm. This spectral segmentation methodology is critical for change detection, as it enables us to
isolate changes by comparing their relation to their spatio-spectral folders. The folder identity provides invariant features for
change detection.

15.  SUBJECT  TERMS 

Hyperspectral,  Change  Detection,  Remote  Sensing,  Target  Detection,  Non-Linear,  Processing 


16. SECURITY CLASSIFICATION OF:
a. REPORT: UU
b. ABSTRACT: UU
c. THIS PAGE: UU

17. LIMITATION OF ABSTRACT: SAR

18. NUMBER OF PAGES: 137

19a. NAME OF RESPONSIBLE PERSON: Dr. Andreas Coppi

19b. TELEPHONE NUMBER (Include area code): 203-285-8617

Standard  Form  298  (Rev.  8/98) 

Prescribed  by  ANSI  Std.  Z39.18 


Summary  of  Results 


Throughout this Phase I project, we have integrated a suite of nonlinear signal processing
algorithms derived from diffusion geometry into an existing proprietary hyperspectral
processing toolbox. These methods enable the organization and comparison of
spatio-spectral features of hyperspectral images acquired under different conditions, for
target detection and for change and anomaly assessment. The main ingredient in our
approach is a high-level “geometrization” of spatio-spectral signatures.

In the first months of this project (see Appendix A), we developed an approach that
simultaneously segments a scene in terms of similarities of spatio-spectral signatures at
different levels of inference and partitions the feature space of spectra and morphology
into groups of features related to the various locations in the scene. We refer to this
approach, in which we interrogate and organize both the pixels and their responses, as the
questionnaire organization paradigm, described in Appendix A. This spectral
segmentation methodology is critical for change detection, as it enables us to isolate
changes by comparing their relation to their spatio-spectral folders. The folder identity
provides invariant features for change detection.

We have also developed a tool that enables automated search in the image (viewed
as a database). It operates as follows: the user identifies a number of reference
pixels in the scene (or in other scenes) and obtains an image in which only pixels having
some affinity with the selected reference pixels are displayed. Moreover, because the
affinity among spectra is defined through local Mahalanobis distances, this tool can
find related pixels across various acquisitions. We have tested this methodology for
matching biological spectra across a database of hyperspectral pathology slides acquired
with different instruments under different conditions, as well as on hyperspectral images
of Smith Island acquired on two different days for change detection.

The same methodology also automatically extracts independent components of the
spectrum, building an empirical model of the constituents of the scene. It is precisely
through this model that the most efficient target search and change detection can be
performed.

Plain  Sight  Systems,  Inc.  has  collaborated  with  Dr.  Coifman’s  group  in  the  Applied 
Mathematics  Department  of  Yale  University  for  algorithm  development,  and  has 
integrated  these  methods  into  their  proprietary  Hyperspectral  Explorer  software  package 
for  hyperspectral  information  organization  and  processing  (Appendix  D). 

The main software development tasks have been finished, and we are currently applying
the methods to hyperspectral data of interest to the Air Force and validating them on that data.

The enclosed reports (Appendices B and C) describe some of the algorithms that were tested
and developed in the latter months of Phase I. It is expected that they will be simplified
considerably and will enable real-time change detection. We have mostly tested a variety of
scalings and renormalizations that would stay invariant across instruments and
illumination conditions.

In the first report, on anomaly and target detection (Appendix B), we view the
local spectral covariance matrix as a good, simple spectral invariant because it quantifies
the relation of a location to its most spectrally similar and spatially close points. As
shown, this approach is quite effective in isolating anomalous pixels. In the second
enclosed report (Appendix C), we introduce a number of mathematical calibration
methods to relate calibrated spectra on two days for change detection. This
method can easily be combined with the target detection approach of the first report.

As an illustration of the results, we look at an area near the beach imaged on two different
days (all spectra differ between the two acquisitions). In the pair of daily images below, the
spectrum of one pixel, marked in red, has been modified. It is the only pixel detected, apart
from the red area off the beach to the lower right, which is simply change due to ocean wave
activity.

Summary of Theory


To  summarize,  an  existing  signal  processing  toolbox  was  augmented  by  methods 
developed  in  Phase  I  to  extract  structure  and  information  from  heterogeneous  images  and 
other  data  sets.  These  methodologies  enable  efficient  integration  and  fusion  of 
heterogeneous  image  sources  with  information  processing  tasks  and  are  particularly  well 
adapted  to  hyperspectral  imagery  in  which  spectral  content  can  be  integrated  with 
geometric  image  features. 

The  approach  uses  the  network  of  inferences  and  similarities  between  the  data  points  to 
create  robust  nonlinear  estimators  for  missing  or  noisy  entries.  This  method  enables 
coherent  analysis  of  data  from  a  multiplicity  of  sources  generalizing  signal  processing  to 
a  nonlinear  setting.  By  building  empirical  data  models  it  achieves  nonlinear  decorrelation 
and  dimensionality  reduction  for  intrinsic  data  structures. 


We start by discussing feature-based filtering and signal processing on graphs as a simple
way to understand the effect of introducing affinity (similarity) based diffusions on image
data. For simplicity of display, we start by considering a regular gray-level image in
which we associate to each pixel p a vector v(p) of features: for example, a multiband
electromagnetic spectrum, the responses of a filter bank, or, simplest of all, the 5x5 subimage
centered at the pixel, or any combination of such features. Define a Markov filter

$$A_{p,q} = \frac{\exp(-\|v(p)-v(q)\|^2/\varepsilon)}{\sum_{q'} \exp(-\|v(p)-v(q')\|^2/\varepsilon)}$$


The image I(q) below was filtered using the (nonlinear in the features) procedure
described above, where the feature vector is the 5x5 patch around a pixel:

$$\sum_{q} \frac{\exp(-\|v(p)-v(q)\|^2/\varepsilon)}{\sum_{q'} \exp(-\|v(p)-v(q')\|^2/\varepsilon)}\, I(q)$$


Observe  that  the  edges  are  well  preserved,  as  patches  translated  parallel  to  an  edge  are 
similar  and  contribute  more  to  the  averaging  procedure.  We  should  also  observe  that  if 
we  were  to  repeat  the  procedure  on  the  filtered  image  we  would  get  a  numerical 
implementation  of  various  nonlinear  heat  diffusions  for  image  processing  as  done  by 
Osher  and  Rudin. 
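
The patch-based Markov filter described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the toolbox implementation: the function names `patch_features` and `markov_filter`, the reflective boundary padding, the value of `eps`, and the synthetic step-edge test image are all hypothetical choices made here.

```python
import numpy as np

def patch_features(img, half=2):
    """Stack the (2*half+1)^2 patch around every pixel into one feature row."""
    size = 2 * half + 1
    padded = np.pad(img, half, mode="reflect")  # assumed boundary handling
    h, w = img.shape
    feats = np.empty((h * w, size * size))
    for i in range(h):
        for j in range(w):
            feats[i * w + j] = padded[i:i + size, j:j + size].ravel()
    return feats

def markov_filter(img, eps, half=2):
    """Average every pixel over all pixels, weighted by patch similarity."""
    v = patch_features(img, half)
    sq = np.sum(v * v, axis=1)
    # Squared Euclidean distances between all pairs of feature vectors.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (v @ v.T), 0.0)
    kernel = np.exp(-d2 / eps)
    A = kernel / kernel.sum(axis=1, keepdims=True)  # each row sums to 1
    return (A @ img.ravel()).reshape(img.shape), A

# Toy input: a vertical step edge plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
filtered, A = markov_filter(noisy, eps=1.0)
```

Because patches on the same side of the edge resemble each other far more than patches across it, the averaging in this sketch draws almost entirely on pixels from the same side, which is the edge-preserving behavior noted in the text.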


(It is useful to replace A by a bi-Markovian version

$$A_{p,q} = \frac{\exp(-\|v(p)-v(q)\|^2/\varepsilon)}{\omega(p)\,\omega(q)}$$

where the weights $\omega$ are selected so that A is Markov in p and q.)

The noisy IR image below was filtered using a vector of 25 statistical features associated
with each pixel.


The Markov matrix used for filtering defines a diffusion on the graph of patches or
features, viewed as a subset of 25-dimensional Euclidean space. The eigenvectors of this
diffusion permit us to compute all of its powers and to define a multiscale diffusion
geometry and signal processing on this “image graph”. (By viewing the image as a
function on its feature graph, in which the v(p) are vertices and the Markov filter entries
$A_{p,q}$ are the weights of the edges between v(p) and v(q), we can analyze the image
relative to its features.)

For the next example, consider three noisy sensors measuring the xyz-coordinates of a
trajectory in three dimensions. We could try to denoise each coordinate separately, or we
could use the position vector as a feature vector, as we did for the images above. (In our
case each sensor could be the intensity of one spectral band, and the denoising is achieved
only through comparison with locations in the scene that have similar spectra.)


Plain  Sight  Systems,  Inc. 


-5- 


Proprietary  and  Confidential 


The construction above should be viewed as signal processing, or filtering, on the data
graph. We view all points of the trajectory as a data graph: data points p and q are
vertices, and $A_{p,q}$ is the weight of the edge connecting them, measuring their similarity or
affinity at the smallest scale. We consider the eigenvectors of the Markov matrix
defined above as a basis for all functions on this graph. We can then expand each
coordinate as a function on that graph and restrict the expansion to the first few “low
frequency” eigenfunctions, that is, filter it, and use the filtered coordinates as a clean
trajectory. This generalizes the simple filtering done on images above; see figure 3 below.


Fig  3.  The  green,  red  and  blue  curves  are  respectively  the  coefficients  of  the  xyz 
coordinates,  as  filtered  above,  using  less  than  10  eigenvectors  of  the  Markov  matrix. 
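
A minimal sketch of this graph filtering step is given below, assuming a synthetic helix as the trajectory. The Gaussian scale `eps`, the noise level, and the choice of 9 retained eigenvectors are illustrative assumptions, not values taken from the experiment behind figure 3. The eigenvectors of the row-stochastic Markov matrix are obtained from its symmetric conjugate, which has the same eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0 * np.pi, 300)
clean = np.column_stack([np.cos(t), np.sin(t), 0.2 * t])  # toy helix trajectory
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

# Gaussian affinity between noisy positions (eps is an assumed scale).
eps = 0.25
d2 = np.sum((noisy[:, None, :] - noisy[None, :, :]) ** 2, axis=2)
K = np.exp(-d2 / eps)
deg = K.sum(axis=1)

# P = D^-1 K is the Markov matrix. S = D^-1/2 K D^-1/2 is symmetric,
# has the same eigenvalues, and its eigenvectors u give psi = D^-1/2 u.
S = K / np.sqrt(np.outer(deg, deg))
vals, vecs = np.linalg.eigh(S)
n_keep = 9                                    # "fewer than 10" eigenvectors
top = vecs[:, np.argsort(vals)[::-1][:n_keep]]
psi = top / np.sqrt(deg)[:, None]             # eigenvectors of P

# Expand each coordinate in psi (orthonormal for the deg-weighted inner
# product) and keep only these low-frequency terms.
coeffs = psi.T @ (deg[:, None] * noisy)
filtered = psi @ coeffs
```

Note that the graph is built from positions alone, so this sketch pulls points back toward the curve but cannot undo noise along the direction of travel.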


Diffusion  Geometries 

These simple examples indicate that diffusion and harmonic analysis are useful for
coherent sensor integration and fusion, enabling signal processing for nonlinearly
correlated data streams. Diffusion geometries enable the definition of affinities and
related scales between any digital data points in $\mathbb{R}^n$ (provided, of course, that
“infinitesimal proximity” in the coordinates corresponds to true affinity between data
points). Moreover, they enable the organization of the population of sensor outputs into
“affinity folders”, or subsets, at different scales, with a high level of affinity among their
responses.

In  particular  the  eigenfunctions  of  the  Diffusion  operator  or  equivalently  a  Laplacian  on  a 
graph  provide  useful  empirical  coordinates,  which  enable  an  embedding  of  the  data  to 
low  dimensional  spaces  so  that  the  diffusion  distance  at  time  t  on  the  original  data 
becomes  Euclidean  distance  in  the  embedding,  in  effect  providing  a  nonlinear  version  of 
the  SVD  as  well  as  a  powerful  dimensional  reduction  methodology.  Moreover  we 
indicate  how  the  diffusion  at  different  times  leads  to  a  multiscale  analysis  generalizing 
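wavelets and similar scaling mechanisms.

To be specific, let the bi-Markov matrix A defined above be represented in terms of its
eigenvectors $\varphi_l$ and eigenvalues $\lambda_l$, and define the diffusion map at time t
into m-dimensional Euclidean space by

$$X_i \mapsto X_i^{(t)} = \big(\lambda_1^t \varphi_1(X_i),\ \lambda_2^t \varphi_2(X_i),\ \ldots,\ \lambda_m^t \varphi_m(X_i)\big).$$

For a given t, we determine m so that $\lambda_{m+1}^t$ is negligible. The diffusion distance at time t
between $X_i$ and $X_j$ is given as

$$d_t^2(X_i, X_j) = \big\|X_i^{(t)} - X_j^{(t)}\big\|^2.$$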

As an application, we show the advantage of using diffusion distance over ordinary
Euclidean distance in the context of hyperspectral imagery and tracking. The
hyperspectral image below has three labeled sets, shown in green, red, and blue. The task is to
classify the rest of the image by spectral similarity to the labeled sets. A nearest neighbor
classifier using Euclidean distance fails because of regional drift in the spectra, whereas the
diffusion distance, which measures all chains of similarity linking a given pixel to the labeled
set, does remarkably well.


Conventional nearest neighbor search compared with a diffusion search. The data is
a pathology slide; each pixel is a digital document (spectrum below for each class).

[Figure: 256 x 256 image with 861 labeled points. Panel titles: GREEN (969 pts), BLUE (1142 pts).]

The diffusion map enables us to represent geometrically an abstract set of measurements
on a sensor array (measurement space) and to measure diffusion distances directly in the
low-dimensional representation. Diffusion geometry enables a multiscale organization
(in feature space) of the pixels in a scene, as illustrated in the following urban scene.


Here the left image is the original, while the right image is spectrally segmented. This method
is not sensitive to the selection of bands or to illumination and atmospheric
distortions, thereby providing invariant features. The segmentation above is obtained by
clustering at different scales in the diffusion embedding space, as seen below.

Another methodology for defining invariant spectral features will be described
below.


The next image illustrates the organizational ability of diffusion maps on a collection
of images of the text “3D”, captured in various random orientations relative to the camera
and light and reordered by the diffusion mapping given by the first two nontrivial
eigenfunctions. We see that the intrinsic geometry has emerged automatically.


Intrinsic  parameterizations  of  hyperspectral  images 

As discussed previously, it is possible to mathematically reparameterize spectral data to
enable robust structural change detection. This procedure extends the methodologies
described in [9]-[12], where PCA is used to renormalize spectral data to enable
comparison between two acquisitions. Here we use local whitening to build a global
explicit parameterization that is invariant under nonlinear perturbations of the spectrum,
calibrating the data independently of the measurement modes (or even sensors, provided
that they are affected by the same parameters).

More specifically, we let

$$A_{p,q} = \exp\!\Big(-\big[(\sigma_p - \sigma_q)^{T} C_p^{-1} (\sigma_p - \sigma_q) + (\sigma_p - \sigma_q)^{T} C_q^{-1} (\sigma_p - \sigma_q)\big]\Big)$$

where $C_p^{-1}$ is the inverse of the covariance matrix of the spectra $\sigma$ in a small
region around p, and $\sigma_p$ is the spectrum at p (or another feature vector, such as all 9
spectra in the 3x3 block centered at p).

If we assume a generalized nonlinear Beer's law, i.e. that

$$\sigma = F(c_1, c_2, \ldots, c_r) = F(c)$$


where the vector c represents the concentration (or “amount”) vector of r independent
constituents affecting the measured spectrum, then we can find among the first few
eigenvectors of the matrix A, r monotone functions of the r independent constituents in c.
(The underlying mathematical assumption is that the inverse of the covariance matrix is
the “square” of the Jacobian of the inverse of F, the map from data to the parameter c.) We
show that the process of computing the eigenvectors of A solves a nonlinear differential
equation going from data to intrinsic parameters. Moreover, we provide methodologies for
extending these eigenvectors $\varphi(\sigma)$ from the measured data to any vector $\sigma$ of the
same dimension.
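
To make the role of local whitening concrete, the following sketch builds an affinity from local inverse covariances and checks that it does not change when every spectrum is passed through the same invertible linear map (a simple stand-in for a change of sensor). It is an illustration only: the neighborhoods are windows of consecutive sample indices rather than spatial blocks, the scale `eps`, the random data, and the matrix `M` are hypothetical, and the function name is invented here.

```python
import numpy as np

def local_mahalanobis_affinity(spectra, half_window=6, eps=1.0):
    """Affinity from covariances estimated in an index window around each sample."""
    n = spectra.shape[0]
    inv_cov = []
    for p in range(n):
        lo, hi = max(0, p - half_window), min(n, p + half_window + 1)
        # Rows are samples, columns are bands, hence rowvar=False.
        inv_cov.append(np.linalg.inv(np.cov(spectra[lo:hi], rowvar=False)))
    A = np.empty((n, n))
    for p in range(n):
        for q in range(n):
            diff = spectra[p] - spectra[q]
            m = diff @ inv_cov[p] @ diff + diff @ inv_cov[q] @ diff
            A[p, q] = np.exp(-m / eps)
    return A

rng = np.random.default_rng(0)
sigma = rng.standard_normal((40, 3))          # 40 toy "spectra" with 3 bands
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])               # an invertible distortion

A_original = local_mahalanobis_affinity(sigma)
A_distorted = local_mahalanobis_affinity(sigma @ M.T)
```

For a linear distortion the invariance is exact; for a smooth nonlinear distortion it holds only approximately, to the extent that the map is close to linear within each small neighborhood.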

This construction is a natural nonlinear generalization of principal component analysis as
well as independent component analysis. The remarkable property of this construction is
that any random encoding of $\sigma$, say by projecting the spectra on a random collection of
r + 1 vectors, leads to the same parameterization of the independent constituents (with
high probability), thereby reducing the complexity of acquisition to the number of
relevant parameters, as opposed to full-resolution spectra. In the picture below we show
that classification of tissue type can easily be achieved through a small number of
encoded light measurements.


The image on the left is an RGB representation of the encoded measurements, whereas
on the right the measurements have been organized by diffusion geometry to provide intrinsic
biological parameters quantifying tissue constituents. The image on the right is
independent of the selection of spectral encodings.


Fusion  of  spatial  and  spectral  information 

For change detection in a heterogeneous environment, we can view each set of features as
corresponding to a different sensor (say, spatial features or spectral features). Each category
of features can be parameterized and normalized in its intrinsic diffusion coordinates. A
new graph is then created that uses as coordinates the relevant diffusion coordinates
emanating from the different species of features. This goal can also be achieved through direct
concatenation of spectral patches in the scene, the point being that the spatial distribution
of spectra may be a more robust indicator of change than a single pixel-by-pixel match.
Moreover, this methodology also provides a powerful registration tool, as it enables the
matching of distributed spatio-spectral features in both images independently of their
position.

Appendix A: Hyperspectral Image Organization Through
Spectral “Questionnaires” (PowerPoint Presentation)

• The questionnaire approach is being tested to extract anomalies and change
between two hyperspectral acquisitions of the same scene. After local
Mahalanobis whitening, we build a joint questionnaire model for both images, and

We use diffusion geometries as a tool to organize hyperspectral data
or, more generally, vector-valued images or any digital database.

We can use this geometry to organize our data (a collection of
points in high dimensions, i.e. spectra of individual pixels) into a

Conventional nearest neighbor search compared with a diffusion search. The data
is a pathology slide; each pixel is a digital document (spectrum below for each class).


Our  search  Projection  based  on  diffusion  geometry  provides  the  bottom  image  on  the  right 


A  simple  empirical  diffusion/inference  matrix  A  can  be  constructed  as  follows 

Let X represent normalized data (they are simply rows of a data matrix). We “soft
truncate” the covariance matrix, defining an infinitesimal affinity as

The same image is segmented
into folders of pixels whose
spectra are similar, bottom left.


We view the hyperspectral image as a questionnaire in which each pixel is
interrogated spectrally, and each of the 206 spectral bands is a response.

This database, displayed here for 1000 pixels, is then organized as a graph of

The tree of scene segmentation by spectral similarity, displayed
previously in the embedding and the image


Use left/right arrows (point folder) or click on a tree node to choose point level and folder

Use up/down arrows (sensor folder) or click on a tree node to choose sensor level and folder


Mutual Organization / Tree Structures for context-concept duality.
Although we use linguistic analogies, these trees were built on time series of observations of
500 objects; the concepts are scenarios of times with similar responses among the

The image on the right is a good segmentation of the textures.
Observe that no assumptions or filters were given. This can be
done as easily without using the FT.


The point is that any anomalous, inconsistent response is visible as noise on the
bar code.

Technical Report:

Target Detection Using Diffusion Geometry and
Local Covariance Matrices


1  Introduction 

In this technical report we examine the target detection problem for hyperspectral
imagery (HSI) data. We denote the HSI data set as

$$X = \{x_{ij} : 1 \le i \le l,\ 1 \le j \le w\} \subset \mathbb{R}^D.$$

Here l and w denote the physical dimensions of the data set, while D denotes
the spectral dimension. One may think of each $x_{ij}$ as a pixel vector in $\mathbb{R}^D$ with
geographic coordinates given by (i, j). We assume throughout the report that
$\|x_{ij}\|_2 = 1$ for all i, j.

We  shall  consider  targets  to  be  small,  anomalous  features  within  the  data  set. 
By  small  we  mean  a  few  pixels;  by  anomalous  we  mean  that  the  target  pixels 
should  be  different  than  the  surrounding  background  pixels. 

2  The  algorithm 

The  target  detection  algorithm  can  be  broken  into  three  main  steps.  In  the  first 
step  we  compute  a  local  covariance  matrix  for  each  pixel;  the  second  step  then 
computes  diffusion  maps  based  on  these  local  covariance  matrices.  Finally,  the 
targets  are  extracted  from  the  diffusion  maps. 

2.1  Local  covariance  matrices 

Consider the pixel $x_{ij}$. To compute the local covariance matrix for $x_{ij}$, we first
take a square ball around $x_{ij}$ of radius r. By a square ball of radius r we just
mean a square with side length 2r + 1 centered at $x_{ij}$. We shall denote this ball
as $B_r(x_{ij})$ and formally define it as:

$$B_r(x_{ij}) = \{x_{i'j'} \in X : i' \in \mathbb{Z} \cap [i-r, i+r] \cap [1, l],\ j' \in \mathbb{Z} \cap [j-r, j+r] \cap [1, w]\}.$$

We then take a local neighborhood from within the ball $B_r(x_{ij})$. This neighborhood
consists of $x_{ij}$ plus the k closest neighbors to $x_{ij}$ from within $B_r(x_{ij})$
according to their inner product with $x_{ij}$. Let $y_m$ denote the elements of $B_r(x_{ij})$
sorted according to their inner product with $x_{ij}$, so that:

$$\langle x_{ij}, y_m \rangle \ge \langle x_{ij}, y_n \rangle \quad \text{if and only if} \quad m \le n.$$

Note that $y_1 = x_{ij}$. We denote this neighborhood as $\mathcal{N}_{k,r}(x_{ij})$ and formally
define it as:

$$\mathcal{N}_{k,r}(x_{ij}) = \{y_m \in B_r(x_{ij}) : 1 \le m \le k+1\}.$$

We then compute a (k + 1) x (k + 1) local covariance matrix $C_{x_{ij}}$ from the
neighborhood $\mathcal{N}_{k,r}(x_{ij})$. Let $\mu_m = \mathrm{mean}(y_m)$; then the (m, n) entry of $C_{x_{ij}}$ is
given by:

$$C_{x_{ij}}(m,n) = \frac{1}{D-1} \sum_{p=1}^{D} \big(y_m(p) - \mu_m\big)\big(y_n(p) - \mu_n\big).$$


These  local  covariance  matrices  capture  the  statistics  of  the  spectral  neighbors  to 
Xij  that  are  within  a  prescribed  geographic  radius.  One  can  see  their  usefulness 
through  the  following  examples. 
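
The construction of a local covariance matrix can be sketched as follows. This is a hedged illustration rather than the report's code: the function name `local_covariance`, the 0-based indexing, and the random test cube are assumptions made here, and the 1/(D - 1) normalization (the NumPy default) is an assumption chosen to match the factor 1/160 quoted for D = 161 in the examples.

```python
import numpy as np

def local_covariance(cube, i, j, r=2, k=5):
    """(k+1) x (k+1) covariance of pixel (i, j) and its k best-matching neighbors."""
    l, w, _ = cube.shape
    # Square ball of radius r, clipped to the image (0-based indices).
    rows = range(max(0, i - r), min(l, i + r + 1))
    cols = range(max(0, j - r), min(w, j + r + 1))
    ball = np.array([cube[a, b] for a in rows for b in cols])
    # Sort the ball by inner product with the center pixel, largest first.
    scores = ball @ cube[i, j]
    order = np.argsort(-scores, kind="stable")[:k + 1]
    neighbors = ball[order]                   # rows y_1, ..., y_{k+1}
    # np.cov treats each row as one variable observed over the D bands
    # and divides by D - 1.
    return np.cov(neighbors)

rng = np.random.default_rng(0)
cube = rng.random((31, 31, 161))
cube /= np.linalg.norm(cube, axis=2, keepdims=True)   # unit-norm spectra
C = local_covariance(cube, 10, 10)
C_corner = local_covariance(cube, 0, 0)               # ball clipped at the border
```

Since all spectra have unit norm, the center pixel has the largest inner product with itself and is therefore always the first row of `neighbors`.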

2.1.1  Examples 

Consider  figure  1,  which  is  a  patch  of  dimension  31x31x161  depicting  a  road 
cutting  through  grass.  A  2  x  2  target  has  also  been  added,  which  in  this  case  is 
part  of  a  house  from  elsewhere  in  the  larger  image. 


Figure  1:  Pseudo-color  image  of  patch 


In figure 2 we highlight a pixel taken from the grass (highlighted in blue), and
depict the boundary of its square ball of radius r = 2 (highlighted in green) as
well as its k = 5 closest neighbors (highlighted in red).

Figure 2: Grass pixel and neighbors

The associated local covariance matrix is given by:

[6 x 6 matrix $C_{\text{grass}}$, with entries of approximately 0.0062 to 0.0063]

We note that the local neighbors of this pixel are also grass pixels, and as such have
similar spectral signatures. Thus the local covariance matrix is nearly a constant
(in this case 1/160) times a matrix of ones.


Now consider a second pixel, this one taken from the road; see figure 3. Notice
that the k = 5 closest neighbors are all taken from the road as well, and that
none come from the grass. This implies that the local covariance matrix of the
road pixel should be similar to the local covariance matrix of the grass pixel.
Indeed, the local covariance matrix of the road pixel is:

[6 x 6 matrix $C_{\text{road}}$]

Finally, we consider a pixel taken from the target house; see figure 4. Unlike
the previous two examples, since the target is both anomalous and small, some
of the neighbors of the target pixel must in fact not be spectrally similar. In
this case, two neighbors of the house pixel are grass pixels. As such, we would
expect the local covariance matrix to indicate this; indeed, we have:

[6 x 6 matrix $C_{\text{target}}$]

Figure 3: Road pixel and neighbors

Figure 4: Target pixel and neighbors

Computing the Frobenius norm (denoted $\|\cdot\|_F$) of the differences between the
local covariance matrices, we get:

$$\|C_{\text{grass}} - C_{\text{road}}\|_F = 0.00092$$

$$\|C_{\text{grass}} - C_{\text{target}}\|_F = 0.0083$$

$$\|C_{\text{road}} - C_{\text{target}}\|_F = 0.0078$$


Comparing,  we  see  that  the  norm  of  the  difference  between  the  two  non-target 
(background)  pixels  is  an  order  of  magnitude  smaller  than  the  norm  of  the 
difference  between  a  target  pixel  and  a  background  pixel. 
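
The comparison above amounts to a single norm computation per pair of pixels. The sketch below uses made-up 6 x 6 matrices, not the matrices from the report: two nearly constant background matrices and one "target" matrix in which two of the six neighbors are only weakly related to the rest. The specific numbers are hypothetical and chosen only to mimic the order-of-magnitude gap.

```python
import numpy as np

def cov_distance(c1, c2):
    """Frobenius norm of the difference of two local covariance matrices."""
    return np.linalg.norm(c1 - c2, ord="fro")

ones = np.ones((6, 6))
c_grass = ones / 160.0                 # hypothetical background matrices
c_road = 1.02 * ones / 160.0
c_target = ones / 160.0
# Two neighbors (rows and columns 4, 5) differ from the other four.
c_target[4:, :4] = c_target[:4, 4:] = 0.004

d_background = cov_distance(c_grass, c_road)
d_target = cov_distance(c_grass, c_target)
```

With these illustrative values the background-to-background distance is 0.00075 and the background-to-target distance is 0.009.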


2.2  Diffusion  maps 

Using  the  local  covariance  matrices  and  the  Frobenius  norm  we  define  the  fol¬ 
lowing  distance  on  the  data  set  A : 


As  illustrated  above,  this  distance  has  the  property  that  the  distance  between 
two  background  pixels  will  be  small,  whereas  the  distance  between  a  background 
pixel  and  a  target  pixel  will  be  large.  Using  this  distance,  we  define  the  following 
kernel  for  the  data  set  A: 

k{xij^Xi>j>)  =  e  *  j  ,  5  >  0. 


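Now set

$$\omega(x_{ij}) = \sum_{x_{i'j'} \in X} k(x_{ij}, x_{i'j'}),$$

and then define the normalized kernel $p$ as:

$$p(x_{ij}, x_{i'j'}) = \frac{k(x_{ij}, x_{i'j'})}{\omega(x_{ij})}.$$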

We now reorder the elements of $X$ so that they are in list form, and denote this reordering as (slightly abusing notation):

$$X = \{x_i\}_{i=1}^{N}, \quad N = l \cdot w.$$

Define the $N \times N$ matrix $P$ as:

$$P_{ij} = p(x_i, x_j).$$
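
A minimal sketch of the normalization step, assuming a precomputed symmetric matrix of pairwise distances and a Gaussian kernel of the form exp(-d^2/eps). The exact exponent is an assumption here and should be checked against the original report; the function name and the toy distances are hypothetical.

```python
import math

def markov_matrix(dist, eps=1.0):
    """Build a kernel from pairwise distances and normalize each row to sum to one."""
    n = len(dist)
    # Assumed kernel form: k(x_i, x_j) = exp(-d(x_i, x_j)^2 / eps).
    kernel = [[math.exp(-(dist[i][j] ** 2) / eps) for j in range(n)]
              for i in range(n)]
    # omega(x_i) = sum over j of k(x_i, x_j).
    omega = [sum(row) for row in kernel]
    return [[kernel[i][j] / omega[i] for j in range(n)] for i in range(n)]

# Hypothetical distances: pixels 0 and 1 are background, pixel 2 is a target.
toy_dist = [[0.0, 0.1, 2.0],
            [0.1, 0.0, 2.0],
            [2.0, 2.0, 0.0]]
P = markov_matrix(toy_dist, eps=1.0)
```

Each row of the result sums to one, so the matrix can be read as the transition matrix of a random walk on the pixels. Transitions between the two background pixels are far more likely than transitions from a background pixel to the target.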

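We compute the eigenvectors and eigenvalues of $P$, which we denote as $\{\psi_i\}_{i=0}^{N-1}$ and $\{\lambda_i\}_{i=0}^{N-1}$, respectively. Note that $\psi_0$ is constant and that $1 = \lambda_0 \ge |\lambda_1| \ge |\lambda_2| \ge \dots$. The diffusion mapping is then given by:

$$x_i \mapsto \Psi(x_i) = (\lambda_1 \psi_1(i), \dots, \lambda_s \psi_s(i)).$$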

In the above line $s$ is defined to be the unique value such that:

_  ELlM 

where $0 < \delta < 1$ is an accuracy parameter. Note that the eigenvalues will decay very fast due to the fact that most pixels $x_i$ and $x_j$ will satisfy $k(x_i, x_j) \approx 1$. Thus taking a large $\delta$ would allow one to retain most of the information in the diffusion embedding, but would still result in a small value of $s$. Furthermore, due to the fast decay of the eigenvalues, in practice one need not compute all of them, but only enough of them that the value of $s$ is computed accurately.

2.2.1  Example 

We now return to the data set from figure 1. Setting $r = 2$, $k = 5$, $\varepsilon = 1$, and $\delta = 0.99$, we compute the local covariance diffusion map for this data set. Using these parameters and the definition of $s$, we kept $s = 7$ eigenvectors. A scatter plot of $\lambda_1\psi_1$ versus $\lambda_2\psi_2$ is given in figure 5. The red circles correspond to the target pixels, while the blue circles correspond to the remaining pixels. Figures 6 and 7 show the first two eigenvectors.

2.3  Extracting  the  targets 

We now extract the targets based on the diffusion coordinates. As exhibited in figure 5, the diffusion coordinates of the target pixels should be 'far away' from the diffusion coordinates of the background pixels. Rather than computing all pairwise diffusion distances though, we note that the diffusion norm of the background pixels is small relative to the diffusion norm of the target pixels. Therefore, to extract the targets, we examine the norm of each pixel's diffusion map, that is $\|\Psi(x_i)\|_2$.

2.3.1  Example 

We return once more to the data set from figure 1. Using the diffusion maps computed in section 2.2.1, we compute the norm of the diffusion map for each pixel. The results are given in figure 8. After suitable thresholding, the pixels with smaller norms drop out and only the targets remain; see figure 9.
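
The extraction step can be sketched as follows. The report only says that the threshold is "suitable", so the threshold value, the function names, and the toy embedding below are all hypothetical.

```python
import math

def diffusion_norms(embedding):
    """Euclidean norm of each pixel's diffusion coordinates."""
    return [math.sqrt(sum(c * c for c in coords)) for coords in embedding]

def threshold_targets(norms, threshold):
    """Indices of pixels whose diffusion norm exceeds the threshold."""
    return [i for i, value in enumerate(norms) if value > threshold]

# Hypothetical 2-dimensional diffusion coordinates for four pixels;
# only pixel 2 lies far from the origin.
toy_embedding = [[0.01, 0.00], [0.02, -0.01], [0.90, 0.40], [0.00, 0.03]]
toy_norms = diffusion_norms(toy_embedding)
detected = threshold_targets(toy_norms, threshold=0.5)
```

In practice the threshold would have to be chosen from the distribution of norms in the scene, for example by inspecting a histogram of the values in figure 8.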

Figure 5: Scatter plot of first two eigenvectors


Figure 7: Second eigenvector


Figure  8:  Diffusion  norms 


Figure  9:  Thresholded  diffusion  norms 


3  Experiments 

We  now  present  some  further  experiments. 

3.1  Multiple  targets 

We  consider  a  100  x  100  x  161  patch  to  which  we  have  added  4  targets  of  various 
sizes  and  materials;  see  figure  10  for  a  pseudo-color  image.  The  four  targets  and 
their  dimensions  are: 

1.  car,  1x2 

2.  part  of  a  church,  3x3 

3.  pool,  2x2 

4.  house,  4x3 

We ran the local covariance diffusion maps algorithm with the following settings: $r = 3$, $k = 15$, $\varepsilon = 1$, and $\delta = 0.8$. The number of eigenvectors retained was $s = 4$. A scatter plot of the first two eigenvectors is shown in figure 11; the colors represent the following pixels:

1.  dark  blue  -  background  pixels 

2.  light  blue  -  car 

3.  green  -  church 

4.  orange  -  pool 

5.  red  -  house 

Figure 10: Pseudo-color image


The four eigenvectors are shown in figure 12. The norms of each pixel's diffusion map are given in figure 13; a thresholded version is given in figure 14.

3.1.1  Evaluation  of  results 

Examining  figures  13  and  14  more  closely,  we  make  the  following  points: 

•  In  figure  13  each  target  pixel  is  highlighted. 

•  Also  in  figure  13  it  appears  that  there  may  be  some  other  cars  on  the  road 
that  were  part  of  the  original  image. 

•  The  brightest  pixels  are  shown  in  figure  14.  In  this  figure,  we  see  that 
at  least  one  pixel  from  each  target  is  highlighted.  More  specifically,  the 
entire  house  is  highlighted,  as  well  as  the  entire  church;  3/4  pool  pixels 
are  visible,  and  1/2  car  pixels  are  visible. 

•  A  closer  examination  of  the  two  missing  pixels  in  figure  14  reveals  that 
they  are  most  likely  mixed  pixels,  which  could  explain  their  lower  intensity. 

•  There  is  one  faint  false  positive  pixel  in  figure  14. 

4  Possible  Extensions 

4.1  Generalized  covariance  matrices 

Suppose that we normalize the data set $X$ such that not only $\|x_{ij}\|_2 = 1$ but also $\operatorname{mean}(x_{ij}) = 0$. The local covariance matrix for $x_{ij}$ would then be:

Figure 11: Scatter plot of first two eigenvectors


Figure 12: Eigenvectors

(a) Eigenvector 1

(b) Eigenvector 2


Figure 13: Diffusion norms


Figure 14: Thresholded diffusion norms


Thus we are using the inner product as a measure of the similarity between $x_{ij}$ and the pixels in its local neighborhood. We could perhaps improve the results by using a more sophisticated measure of similarity, which we denote by the general function $f$. Thus we have a generalized covariance matrix:

$$(C_{x_{ij}})_{mn} = f(y_m, y_n).$$


1 Introduction

In this technical report we examine the change detection problem for hyperspectral imagery (HSI) data. We assume that we have two hyperspectral data sets $X^{(1)}$ and $X^{(2)}$ which depict the same scene, but were taken at different times:

$$X^{(a)} = \{x_{ij}^{(a)} : (i,j) \in \mathcal{X}\} \subset \mathbb{R}^D, \quad a = 1, 2.$$

Here $l$ and $w$ denote the physical dimensions of the data set, while $D$ denotes the spectral dimension. One may think of each $x_{ij}^{(a)}$ as a pixel vector in $\mathbb{R}^D$ with geographic coordinates given by $(i,j)$. These coordinates are collected in the set $\mathcal{X}$, i.e.

$$\mathcal{X} = \{(i,j) : i \in \mathbb{Z} \cap [1,l],\ j \in \mathbb{Z} \cap [1,w]\}.$$

The  idea  behind  change  detection  is  to  determine  the  anomalous  changes  that 
occur  from  day  one  to  day  two.  Atmospheric  changes,  changes  in  the  lighting, 
and  other  changes  that  affect  the  entire  area  should  not  be  taken  into  account. 

1.1  Data  normalization 

Without some sort of data normalization, accurately detecting changes between the two days is nearly impossible. To that end we normalize both data sets so that:

$$\|x_{ij}^{(a)}\|_2 = 1 \quad \text{and} \quad \operatorname{mean}(x_{ij}^{(a)}) = 0.$$

Note  that  normalizing  the  norm  so  that  it  equals  one  for  each  data  point  seems 
to  be  the  key  part.  Setting  the  mean  of  each  vector  to  zero  seems  to  have  less 
influence  on  the  final  outcome;  perhaps  it  is  not  necessary  or  something  else  is 
more  appropriate  (such  as  subtracting  out  the  mean  of  the  data  set). 
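
A minimal sketch of the per-pixel normalization. The report does not state the order of the two operations; the sketch assumes that the mean is subtracted first and the result is then scaled, since scaling a zero-mean vector keeps its mean at zero and so both conditions hold at once. The function name is hypothetical.

```python
import math

def normalize_pixel(x):
    """Return a copy of the spectrum x with zero mean and unit l2 norm."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        # A constant spectrum has no direction left after centering.
        raise ValueError("constant spectrum cannot be scaled to unit norm")
    return [v / norm for v in centered]

normalized = normalize_pixel([1.0, 2.0, 4.0])
```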


2  Simple  solutions 

We  present  some  possible  simple  solutions. 

2.1 Inner product

One can use the inner product as a measure of similarity, i.e.:

$$d_{\mathrm{ip}}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \arccos(\langle x_{ij}^{(1)}, x_{ij}^{(2)} \rangle). \tag{2.1}$$

See  section  6  for  some  examples. 


2.2 $\ell^2$ distance

One can use the $\ell^2$ distance as a measure of similarity, i.e.:

$$d_{\ell^2}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \|x_{ij}^{(1)} - x_{ij}^{(2)}\|_2. \tag{2.2}$$

See  section  6  for  some  examples. 
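
The two pixelwise measures (2.1) and (2.2) can be sketched as below, assuming both pixel vectors have already been normalized to unit norm. The function names are hypothetical, and the clamp before the arccosine is an added safeguard against round-off, not something described in the report.

```python
import math

def d_ip(x1, x2):
    """Angle between two unit-norm pixel vectors, as in (2.1)."""
    dot = sum(a * b for a, b in zip(x1, x2))
    dot = max(-1.0, min(1.0, dot))  # keep round-off from leaving [-1, 1]
    return math.acos(dot)

def d_l2(x1, x2):
    """Euclidean distance between two pixel vectors, as in (2.2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

angle = d_ip([1.0, 0.0], [0.0, 1.0])
gap = d_l2([1.0, 0.0], [0.0, 1.0])
```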


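2.3 Total $\ell^2$ distance


One can also compare the distances within one day to the distances within the second day. More specifically, we compute:

$$d_{\text{total } \ell^2}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \Big( \sum_{(i',j') \in \mathcal{X}} \big| d_{\ell^2}(x_{ij}^{(1)}, x_{i'j'}^{(1)}) - d_{\ell^2}(x_{ij}^{(2)}, x_{i'j'}^{(2)}) \big|^2 \Big)^{1/2}. \tag{2.3}$$
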


See  section  6  for  some  examples. 
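
The idea of comparing within-day distances can be sketched as follows. Equation (2.3) is badly garbled in the source, so this sketch assumes one plausible reading: for each pixel, the square root of the summed squared differences between its day-one and day-two distances to every other pixel. The pixels are assumed to be flattened into lists in the same order for both days, and all names are hypothetical.

```python
import math

def l2(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def total_l2_map(day1, day2):
    """Per-pixel change score under the assumed reading of (2.3)."""
    if len(day1) != len(day2):
        raise ValueError("both days must contain the same number of pixels")
    n = len(day1)
    scores = []
    for i in range(n):
        acc = 0.0
        for j in range(n):
            acc += (l2(day1[i], day1[j]) - l2(day2[i], day2[j])) ** 2
        scores.append(math.sqrt(acc))
    return scores

# Hypothetical data: only the last pixel changes between the two days.
toy_day1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_day2 = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
toy_scores = total_l2_map(toy_day1, toy_day2)
```

The double loop costs on the order of N^2 distance evaluations, which is why a sketch like this is only practical for small patches.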


2.4 PCA $\ell^2$ distance


One can first compute the PCA mapping of each day, normalizing so that each new dimension has variance one; denote this mapping as:

$$Y^{(a)} = \{y_{ij}^{(a)} : (i,j) \in \mathcal{X}\} \subset \mathbb{R}^d.$$

Note that the dimension $d$ can either be selected or one can set it as the dimension required to retain a certain percentage of the energy in the data set. The distance between pixels is then given by:

$$d_{\text{PCA } \ell^2}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \|y_{ij}^{(1)} - y_{ij}^{(2)}\|_2. \tag{2.4}$$


3 Local coordinates for change detection

We compute new local coordinates for each data set. To do this we first take a square ball around $x_{ij}^{(a)}$ of radius $r$. By square ball of radius $r$ we just mean a square with side length $2r + 1$ centered at $x_{ij}^{(a)}$. We shall denote this ball as $B_r(x_{ij}^{(a)})$ and formally define it as:

$$B_r(x_{ij}^{(a)}) = \{x_{i'j'}^{(a)} \in X^{(a)} : (i',j') \in [i-r, i+r] \times [j-r, j+r]\}.$$
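
The index set of the square ball can be sketched as follows, using the 1-based coordinates of the report. Clipping the square at the image border is an assumption; the report does not say how pixels near the edge are handled. The function name is hypothetical.

```python
def square_ball(i, j, r, l, w):
    """Coordinates (i', j') of the square of side 2r + 1 centered at (i, j).

    Coordinates are 1-based, with 1 <= i' <= l and 1 <= j' <= w.
    The square is clipped at the image border (an assumption).
    """
    return [(a, b)
            for a in range(max(1, i - r), min(l, i + r) + 1)
            for b in range(max(1, j - r), min(w, j + r) + 1)]

interior = square_ball(5, 5, 2, 100, 100)
corner = square_ball(1, 1, 2, 100, 100)
```

An interior pixel with r = 2 gets the full 25 coordinates, while a corner pixel gets only the 9 that fall inside the image.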

We then compute PCA coordinates for each ball, and retain 99% of the energy (the 99% number can be changed of course); let $d_{ij}$ be the dimension required by PCA to retain this amount of the energy. We denote by $y_{i'j'}^{(a)} \in \mathbb{R}^{d_{ij}}$ the PCA coordinates of the pixels $x_{i'j'}^{(a)} \in B_r(x_{ij}^{(a)})$, together with the corresponding eigenvalue of each PCA dimension. Before proceeding we normalize the PCA coordinates so that each dimension has variance equal to one; that is, we divide each PCA coordinate by the square root of its eigenvalue.


We now mark the coordinates of the pixels in $B_r(x_{ij}^{(a)})$ that are spectrally close

to  Xij  in  the  PCA  coordinates.  Set  we  compute: 


:  Wvlf  -  y^^,\\2  <  rlf}. 


Using these coordinates we form a spectral neighborhood of $x_{ij}^{(a)}$ from within the geographic ball $B_r(x_{ij}^{(a)})$:


da) 


(a)x 


da). 


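We now use this local neighborhood to compute three representatives of $x_{ij}^{(a)}$ which we will later use to compute graphs. The first of these is the average vector of $\mathcal{N}^{(a)}(x_{ij}^{(a)}; r)$:

$$\bar{x}_{ij}^{(a)} = \frac{1}{|\mathcal{X}^{(a)}(x_{ij}^{(a)}; r)|} \sum_{(i',j') \in \mathcal{X}^{(a)}(x_{ij}^{(a)}; r)} x_{i'j'}^{(a)}. \tag{3.1}$$
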


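To compute the other two, we first compute the $D \times D$ covariance matrix of $\mathcal{N}^{(a)}(x_{ij}^{(a)}; r)$. Let $k = |\mathcal{X}^{(a)}(x_{ij}^{(a)}; r)|$, and note that if $k < D$ (which is usual), then the rank of this covariance matrix is $k$. We then compute the $k$ nonzero eigenvalues of the covariance matrix and their corresponding eigenvectors; we denote these two sets, respectively, as follows:

$$\Lambda_r(x_{ij}^{(a)}) = \{\lambda_1(x_{ij}^{(a)}), \dots, \lambda_k(x_{ij}^{(a)})\}, \tag{3.2}$$

$$V_r(x_{ij}^{(a)}) = \{v_1(x_{ij}^{(a)}), \dots, v_k(x_{ij}^{(a)})\}. \tag{3.3}$$

Combining equations (3.1), (3.2), and (3.3), we have the following map on the set $X^{(a)}$:

$$x_{ij}^{(a)} \mapsto \big(\bar{x}_{ij}^{(a)}, \Lambda_r(x_{ij}^{(a)}), V_r(x_{ij}^{(a)})\big). \tag{3.4}$$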

3.1 Constructing the graphs and comparing them

Given the new coordinates defined in (3.4), we now use them to construct graphs pertaining to $X^{(1)}$ and $X^{(2)}$. The weights of these graphs are given by the following:


1 


(3.5) 


Taking $a = 1$ and $a = 2$ will give us two graphs, one for day one and one for day two. We compare these two graphs in order to gain a similarity measure between $x_{ij}^{(1)}$ and $x_{ij}^{(2)}$. This similarity measure is defined as:

$$d_{\text{total } \eta_r}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \Big( \sum_{(i',j') \in \mathcal{X}} \big| \eta_r(x_{ij}^{(1)}, x_{i'j'}^{(1)}) - \eta_r(x_{ij}^{(2)}, x_{i'j'}^{(2)}) \big|^2 \Big)^{1/2}. \tag{3.6}$$

We can then plot the entries of $d_{\text{total } \eta_r}$ to get a two dimensional map depicting which pixels changed and which did not. See section 6 for examples. Note that the $d_{\text{total } \eta_r}$ map is analogous to the $d_{\text{total } \ell^2}$ map, except that it is computed with these new coordinates and with the distance defined by (3.5).


3.2  Diffusion  maps 


We  can  take  the  previous  section  one  step  further  by  computing  diffusion  maps 
based  on  the  distance  defined  in  (3.5).  More  specifically,  define  the  following 
kernel: 

krA4f,4"i')  =  e  >  0. 


Now  set 

‘^r,s  ix\f)=  i' )  ’ 

{i',j')ex 

and  then  define  the  normalized  kernel  Pr,e  as: 


(3.7) 


(3.8) 


Let  be  the  N  X  N  (where  N  =  hw)  matrix  associated  with 
We  compute  the  eigenvectors  and  eigenvalues  of  Pr^\  which  we  denote  as 
and  respectively.  Note  that  is  constant  and  that 

1  =  Co  ^  ^  ^1  —  —  The  5-dimensional  diffusion  mapping  at  time  t 

of  day  a  is  then  given  by: 


44 


(a) 
r,£,t  V 


(4?)  =  ((d“0‘4^(44---,  (e4)T4(4?)) 

We can now detect change by examining the following map:

$$\rho_{r,\varepsilon,t}(x_{ij}^{(1)}, x_{ij}^{(2)}) = \big\|\Psi_{r,\varepsilon,t}^{(1)}(x_{ij}^{(1)}) - \Psi_{r,\varepsilon,t}^{(2)}(x_{ij}^{(2)})\big\|_2. \tag{3.9}$$

See section 6 for examples.


Remark:  See  the  appendix,  sections  7.1  and  7.2,  for  a  note  on  a  technical  issue 
concerning  the  computation  of  the  eigenmaps. 


3.3  Symmetrizing  the  distance  and  the  kernel 

In the computation of the diffusion maps detailed in the previous section, 3.2, both the distance $\eta_r$ and the diffusion kernel $p_{r,\varepsilon}$ were not symmetric. These two issues can be remedied in the following ways. First, to address the issue concerning $\eta_r$, we can redefine the initial kernel given in equation (3.7) as follows:


Secondly, we can symmetrize the normalized kernel $p_{r,\varepsilon}$ given in equation (3.8) by redefining it as:


Pr, 


.<?>) = 


Ul). 


where $\tilde{\omega}_{r,\varepsilon}$ is defined the same as $\omega_{r,\varepsilon}$ except that it sums entries of $\tilde{k}_{r,\varepsilon}$.


4  Global  coordinates 

We can modify the local coordinates described in section 3 to search globally (geographically speaking) for spectral nearest neighbors, as opposed to the local search described earlier. Mathematically speaking, this is equivalent to taking $r = \infty$ above; however, we make a few algorithmic adjustments, as well as one mathematical adjustment due to memory considerations.

Since we are performing a global search, we can apply PCA once to the entire data set $X^{(a)}$; we denote the resulting set as:

$$Y^{(a)} = \{y_{ij}^{(a)} : (i,j) \in \mathcal{X}\} \subset \mathbb{R}^d.$$

As in the local version, we assume that this set has been normalized so that the variance in each dimension is one. We also take as many dimensions as are necessary to retain 99% of the energy.

For each pixel $x_{ij}^{(a)}$ we now perform a global search for spectral neighbors. We take $r_{ij}^{(a)} = \|x_{ij}^{(a)}\|_2$ to be the radius of the spectral ball that any neighbor must fall within, and we also let $k_0$ denote the maximum number of neighbors we are willing to allow (this is mainly due to memory considerations on the computer). Formally, we define:


and 


^ko 


G  X  :  is  one  of  the  ko  closest  spectral  neighbors  of 


The  neighbors  of  are  then  given  by: 


Af 


(«) 


,ko 


{x\f)  =  {4"/  G  :  {i\f)  e 


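As in section 3, we now compute the average vector of $\mathcal{N}_{\infty,k_0}^{(a)}(x_{ij}^{(a)})$ as well as its covariance matrix. Let $\bar{x}_{ij}^{(a)}$ denote the average vector, and let $\Lambda_{\infty,k_0}(x_{ij}^{(a)})$ and $V_{\infty,k_0}(x_{ij}^{(a)})$ denote the eigenvalues and eigenvectors of the covariance matrix, respectively. We then have the following map:

$$x_{ij}^{(a)} \mapsto \big(\bar{x}_{ij}^{(a)}, \Lambda_{\infty,k_0}(x_{ij}^{(a)}), V_{\infty,k_0}(x_{ij}^{(a)})\big). \tag{4.1}$$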


We also have the maps $\eta_{\infty,k_0}$ and $d_{\text{total } \eta_{\infty,k_0}}$, which are identical to $\eta_r$ and $d_{\text{total } \eta_r}$, respectively, except for the fact that the new maps use the global coordinates defined in (4.1). See section 6 for examples.


4.1  Diffusion  maps 

As in section 3.2, we can compute the diffusion maps based on $\eta_{\infty,k_0}$. The details are exactly the same, except that $\eta_r$ is replaced with $\eta_{\infty,k_0}$. We denote the resulting diffusion similarity measure by $\rho_{\infty,k_0,\varepsilon,t}$.


5  Comparing  nearest  neighbors 

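To detect change one can compare the nearest neighbors of $x_{ij}^{(1)}$ to the nearest neighbors of $x_{ij}^{(2)}$. Let $\mathcal{N}_k(x_{ij}^{(a)})$ denote the $k$ closest spectral neighbors to $x_{ij}^{(a)}$, as measured by the $\ell^2$ distance, and let $\mathcal{X}_k(x_{ij}^{(a)})$ be their corresponding coordinates.

We first determine the common neighbors of $x_{ij}^{(1)}$ and $x_{ij}^{(2)}$; that is, we compute the set:

$$\mathcal{X}_k(x_{ij}^{(1)}, x_{ij}^{(2)}) = \mathcal{X}_k(x_{ij}^{(1)}) \cap \mathcal{X}_k(x_{ij}^{(2)}).$$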

The  remaining  neighbors  are  then  paired  so  as  to  minimize  the  total  spectral 
difference.  More  specifically,  define 

4  = 


6 


and  let 


■^k{4f)  =  {4"^'  ^  ^k{x^u)  ■  (^'>/)  ^  '^k{x\\\xYA}  =  { 


(1)  .  7  — 


^7 


We also set $S_{k'}$ to be the set of permutations on $k'$ elements. Finally, for each coordinate pair $(i,j) \in \mathcal{X}$, we compute


dnn  i^ij  5  ^ 


(1)  ^(2)^  _ 


)  =  min 


^ij 

.  El 

7=1 


db';  1) 


—  X 


hi;  2)  I 
cr(7)  I 


(5.1) 


The nearest neighbor distance map, $d_{nn}$, gives a measure of how much change occurred between $x_{ij}^{(1)}$ and $x_{ij}^{(2)}$.
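
The pairing of the non-common neighbors can be sketched by brute force over all permutations. Equation (5.1) is garbled in the source, so the cost used here (the sum of $\ell^2$ distances between paired neighbors) is an assumption, and all names are hypothetical. The sketch takes the two lists of remaining neighbors as given.

```python
import itertools
import math

def l2(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def min_pairing_cost(neighbors1, neighbors2):
    """Smallest total distance over all one-to-one pairings of the two lists."""
    if len(neighbors1) != len(neighbors2):
        raise ValueError("both lists must have the same length")
    k = len(neighbors1)
    best = math.inf
    for perm in itertools.permutations(range(k)):
        cost = sum(l2(neighbors1[g], neighbors2[perm[g]]) for g in range(k))
        best = min(best, cost)
    return best

cost_same = min_pairing_cost([[0.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]])
cost_shift = min_pairing_cost([[0.0, 0.0], [2.0, 0.0]], [[0.0, 1.0], [2.0, 1.0]])
```

Enumerating permutations grows factorially with the number of unpaired neighbors, so for values such as k = 20 a dedicated assignment algorithm would be needed instead of this loop.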


6  Examples 

6.1  Patch  A 

We  consider  the  following  100  x  100  x  113  patch,  for  which  pseudo-color  images 
from  day  one  and  day  two  are  depicted  in  figure  1. 


Figure  1:  Pseudo-color  images  of  patch  A 


(a)  Day  1 


(b)  Day  2 


6.1.1  Simple  Solutions 

The inner product map, $d_{\mathrm{ip}}$, defined in (2.1), and the $\ell^2$ distance map, $d_{\ell^2}$, defined in (2.2), are shown in figure 2.

The total $\ell^2$ distance map, $d_{\text{total } \ell^2}$, defined in (2.3), is given in figure 3.

The PCA $\ell^2$ distance map, $d_{\text{PCA } \ell^2}$, defined in (2.4), was computed with $d = 7$ dimensions, the minimum number required so that both days' mappings retain at least 99% of the information. The result is given in figure 4.

Figure 2: $d_{\mathrm{ip}}$ and $d_{\ell^2}$ similarity measures for patch A


(b) $d_{\ell^2}$


Figure 3: $d_{\text{total } \ell^2}$ similarity measure for patch A


Figure 4: $d_{\text{PCA } \ell^2}$ similarity measure for patch A


6.1.2  Local  coordinates  for  change  detection 

We ran the algorithm described in section 3 with $r = 2$. The $d_{\text{total } \eta_r}$ graph similarity measure, defined in (3.6), is shown in figure 5.

For the $\rho_{r,\varepsilon,t}$ diffusion similarity measure, defined in (3.9), we computed the diffusion maps using $\varepsilon = 1$, $t = 1$, and retained $s = 20$ eigenmaps; the results are given in figure 6. Note that the sign of the eigenvectors was not corrected (as described in section 7.1) for this particular experiment.

We also computed the $\rho_{r,\varepsilon,t}$ diffusion similarity measure using

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_r(x_{ij}^{(a)}, x_{i'j'}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

$t = 1$, and retained $s = 6$ eigenmaps; the results are given in figure 7.


Figure 5: $d_{\text{total } \eta_r}$ ($r = 2$) local similarity measure for patch A


Figure 6: $\rho_{r,\varepsilon,t}$ ($r = 2$, $\varepsilon = 1$, $t = 1$, $s = 20$) local diffusion similarity measure for patch A


Figure 7: $\rho_{r,\varepsilon,t}$ ($r = 2$, $\varepsilon = \varepsilon^{(a)}$, $t = 1$, $s = 6$) local diffusion similarity measure for patch A


6.1.3  Global  coordinates 

We ran the algorithm described in section 4 with $k_0 = 50$. The total difference map, $d_{\text{total } \eta_{\infty,k_0}}$, for patch A is given in figure 8.

We also computed the corresponding diffusion maps, with

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_{\infty,k_0}(x_{ij}^{(a)}, x_{i'j'}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

varying values of $t$, and $s = 6$. We synced the eigenmaps according to the method outlined in section 7.1. The resulting diffusion similarity maps, $\rho_{\infty,k_0,\varepsilon,t}$, are given in figure 9.


Thirdly, we computed the corresponding diffusion maps by symmetrizing the kernel according to section 3.3. We used $\varepsilon$ defined correspondingly as:

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_{\infty,k_0}(x_{ij}^{(a)}, x_{i'j'}^{(a)}) + \eta_{\infty,k_0}(x_{i'j'}^{(a)}, x_{ij}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

and varied the values of $t$. We used $s = 100$ eigenmaps, which were synchronized according to the method outlined in section 7.2. The resulting diffusion similarity maps, $\rho_{\infty,k_0,\varepsilon,t}$, are given in figure 10.


6.1.4  Comparing  nearest  neighbors 

We  ran  the  algorithm  described  in  section  5  with  k  =  20  and  computed  the  map 
dnn  for  patch  A.  The  results  are  given  in  figure  11. 


6.2  Patch  B 

We  consider  the  following  96  x  119  x  113  patch,  for  which  pseudo-color  images 
from  day  one  and  day  two  are  depicted  in  figure  12. 

Figure 8: $d_{\text{total } \eta_{\infty,k_0}}$ ($k_0 = 50$) global similarity map for patch A


6.2.1  Simple  Solutions 

The inner product map, $d_{\mathrm{ip}}$, defined in (2.1), and the $\ell^2$ distance map, $d_{\ell^2}$, defined in (2.2), are shown in figure 13.

The total $\ell^2$ distance map, $d_{\text{total } \ell^2}$, defined in (2.3), is given in figure 14.

The PCA $\ell^2$ distance map, $d_{\text{PCA } \ell^2}$, defined in (2.4), was computed with $d = 6$ dimensions, the minimum number required so that both days' mappings retain at least 99% of the information. The result is given in figure 15.

6.2.2  Local  coordinates  for  change  detection 

We ran the algorithm described in section 3 with $r = 2$. The $d_{\text{total } \eta_r}$ graph similarity measure, defined in (3.6), is shown in figure 16.

For the $\rho_{r,\varepsilon,t}$ diffusion similarity measure, defined in (3.9), we computed the diffusion maps using $\varepsilon = 1$, $t = 1$, and retained $s = 45$ eigenmaps; the results are given in figure 17. Note that the sign of the eigenvectors was not corrected (as described in section 7.1) for this particular experiment.

We also computed the $\rho_{r,\varepsilon,t}$ diffusion similarity measure using

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_r(x_{ij}^{(a)}, x_{i'j'}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

$t = 1$, and retained $s = 4$ eigenmaps; the results are given in figure 18.

6.2.3  Global  coordinates 

We ran the algorithm described in section 4 with $k_0 = 50$. The total difference map, $d_{\text{total } \eta_{\infty,k_0}}$, for patch B is given in figure 19.

Figure 9: $\rho_{\infty,k_0,\varepsilon,t}$ ($k_0 = 50$, $\varepsilon = \varepsilon^{(a)}$, $t$ varying, $s = 6$) global diffusion similarity maps for patch A

(e) $t = 16$

(f) $t = 32$


Figure 10: Symmetrized and synchronized $\rho_{\infty,k_0,\varepsilon,t}$ ($k_0 = 50$, $\varepsilon = \varepsilon^{(a)}$, $t$ varying, $s = 100$) global diffusion similarity maps for patch A

(g) $t = 64$

(h) $t = 128$


Figure 11: $d_{nn}$ nearest neighbor similarity measure for patch A


Figure 12: Pseudo-color images of patch B

(a) Day 1 (b) Day 2


Figure 13: $d_{\mathrm{ip}}$ and $d_{\ell^2}$ similarity measures for patch B

(b)


Figure 14: $d_{\text{total } \ell^2}$ similarity measure for patch B


Figure 15: $d_{\text{PCA } \ell^2}$ similarity measure for patch B


Figure 16: $d_{\text{total } \eta_r}$ ($r = 2$) graph similarity measure for patch B


Figure 17: $\rho_{r,\varepsilon,t}$ ($r = 2$, $\varepsilon = 1$, $t = 1$, $s = 45$) local diffusion similarity measure for patch B


Figure 18: $\rho_{r,\varepsilon,t}$ ($r = 2$, $\varepsilon = \varepsilon^{(a)}$, $t = 1$, $s = 4$) local diffusion similarity measure for patch B

We also computed the corresponding diffusion maps, with

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_{\infty,k_0}(x_{ij}^{(a)}, x_{i'j'}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

varying values of $t$, and $s = 8$. We synced the eigenmaps according to the method outlined in section 7.1. The resulting diffusion similarity maps, $\rho_{\infty,k_0,\varepsilon,t}$, are given in figure 20.


Figure 19: $d_{\text{total } \eta_{\infty,k_0}}$ ($k_0 = 50$) global similarity map for patch B


6.2.4  Comparing  nearest  neighbors 

We  ran  the  algorithm  described  in  section  5  with  k  =  20  and  computed  the  map 
dnn  for  patch  B.  The  results  are  given  in  figure  21. 

6.3  Alternate  patch  B 

We  modified  patch  B,  day  1,  by  placing  a  water  pixel  at  row  37,  column  58. 
This  pixel  was  originally  vegetation.  We  then  reran  some  of  the  experiments. 
Results  are  given  below. 

6.3.1  Simple  solutions 

The PCA $\ell^2$ distance map, $d_{\text{PCA } \ell^2}$, defined in (2.4), was computed with $d = 6$ dimensions, the minimum number required so that both days' mappings retain at least 99% of the information. The result is given in figure 22.

6.3.2  Global  coordinates 

We used the same settings as described in section 6.2.3. The resulting $d_{\text{total } \eta_{\infty,k_0}}$ map is given in figure 23, while the diffusion similarity maps $\rho_{\infty,k_0,\varepsilon,t}$ are given in figure 24.

Figure 20: $\rho_{\infty,k_0,\varepsilon,t}$ ($k_0 = 50$, $\varepsilon = \varepsilon^{(a)}$, $t$ varying, $s = 8$) global diffusion similarity maps for patch B

(a) $t = 1$ (b) $t = 2$

(c) $t = 4$

(d) $t = 8$

(e) $t = 16$

(f) $t = 32$


Figure 21: $d_{nn}$ nearest neighbor similarity measure for patch B


Figure 22: $d_{\text{PCA } \ell^2}$ similarity measure for alternate day 1 patch B

We also computed the diffusion maps by symmetrizing the kernel according to section 3.3. We used $\varepsilon$ defined correspondingly as:

$$\varepsilon = \varepsilon^{(a)} = \operatorname{median}\{\eta_{\infty,k_0}(x_{ij}^{(a)}, x_{i'j'}^{(a)}) + \eta_{\infty,k_0}(x_{i'j'}^{(a)}, x_{ij}^{(a)}) : (i,j), (i',j') \in \mathcal{X}\},$$

and varied the values of $t$. We used $s = 100$ eigenmaps, which were synchronized according to the method outlined in section 7.2. The resulting diffusion similarity maps, $\rho_{\infty,k_0,\varepsilon,t}$, are given in figure 25.

Figure 23: $d_{\text{total } \eta_{\infty,k_0}}$ ($k_0 = 50$) global similarity map for alternate day 1 patch B

Figure 24: $\rho_{\infty,k_0,\varepsilon,t}$ ($k_0 = 50$, $\varepsilon = \varepsilon^{(a)}$, $t$ varying, $s = 8$) global diffusion similarity maps for alternate day 1 patch B

(a) $t = 1$

(b) $t = 2$

(d) $t = 8$

(e) $t = 16$

(f) $t = 32$


Figure 25: Symmetrized and synchronized $\rho_{\infty,k_0,\varepsilon,t}$ ($k_0 = 50$, $\varepsilon = \varepsilon^{(a)}$, $t$ varying, $s = 100$) global diffusion similarity maps for alternate day 1 patch B

(c) $t = 4$

(d) $t = 8$

(e) $t = 16$

(f) $t = 32$

(g) $t = 64$

(h) $t = 128$


7  Appendix 

7.1  Synchronizing  the  sign  of  eigenmaps 

When MATLAB computes the eigenmaps in the diffusion maps algorithm, they are unique only up to a sign change. When dealing with one data set, this is not much of an issue. However, when dealing with two data sets, and subsequently examining the $\ell^2$-difference between each day's diffusion coordinates, this can lead to inaccuracies when trying to detect change. In order to address this issue, we have implemented the following method for correction.


Given the two sets of eigenvectors, we compute the following $s$ inner products:

$$\vartheta(i) = \langle \psi_i^{(1)}, \psi_i^{(2)} \rangle, \quad i = 1, \dots, s.$$

Note that the range of $\vartheta$ is $[-1,1]$, and that if $|\vartheta(i)|$ is near one, then the eigenvectors $\psi_i^{(1)}$ and $\psi_i^{(2)}$ are highly correlated. However, if $\vartheta(i) < 0$ as well, then when computing the difference, these highly correlated eigenvectors will actually appear far apart. Therefore, we make the following adjustment to the signs of the eigenvectors of day 2:

$$|\vartheta(i)| > \theta \ \text{ and } \ \vartheta(i) < 0 \implies \psi_i^{(2)} \mapsto -\psi_i^{(2)},$$

where $\theta$ is some preset measure of correlation.


Note: In the examples that employ this method, we have set $\theta = 0.75$.
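
The sign correction of this section can be sketched in Python as follows. The reports used MATLAB, so this is an illustration rather than the original implementation; the function name is hypothetical, and the eigenvectors are assumed to be unit-norm lists of floats given in matching order for the two days.

```python
def synchronize_signs(vecs_day1, vecs_day2, theta=0.75):
    """Flip a day-two eigenvector when it is strongly anti-correlated with day one."""
    adjusted = []
    for v1, v2 in zip(vecs_day1, vecs_day2):
        corr = sum(a * b for a, b in zip(v1, v2))  # inner product of unit vectors
        if abs(corr) > theta and corr < 0:
            adjusted.append([-b for b in v2])
        else:
            adjusted.append(list(v2))
    return adjusted

day1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
day2 = [[-1.0, 0.0], [0.6, 0.8], [0.8, -0.6]]
synced = synchronize_signs(day1, day2, theta=0.75)
```

Only the first vector is flipped: its inner product is -1. The second pair is positively correlated, and the third has inner product -0.6, which does not pass the 0.75 threshold.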


7.2 Synchronizing the eigenmaps for maximum correlation

We can take the method outlined above one step further, and synchronize the eigenmaps in order to maximize their correlation. To do this, we compute the following permutation (note that the eigenvectors have norm one):

$$\sigma = \arg\max_{\sigma \in S_s} \sum_{i=1}^{s} \big| \langle \psi_i^{(1)}, \psi_{\sigma(i)}^{(2)} \rangle \big|.$$
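
A brute-force sketch of this synchronization for a small number of eigenvectors. The objective in the source is garbled; the sketch assumes it is the sum of absolute inner products between matched eigenvectors. All names are hypothetical, and the eigenvectors are assumed to be unit-norm lists of floats.

```python
import itertools

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def best_permutation(vecs_day1, vecs_day2):
    """Permutation of day-two indices maximizing the total absolute correlation."""
    s = len(vecs_day1)
    best_perm, best_score = None, -1.0
    for perm in itertools.permutations(range(s)):
        score = sum(abs(dot(vecs_day1[i], vecs_day2[perm[i]])) for i in range(s))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm

def synchronize(vecs_day1, vecs_day2):
    """Reorder day-two eigenvectors, then flip any at more than 90 degrees."""
    perm = best_permutation(vecs_day1, vecs_day2)
    matched = []
    for i, p in enumerate(perm):
        v = vecs_day2[p]
        if dot(vecs_day1[i], v) < 0:  # angle greater than 90 degrees
            v = [-c for c in v]
        matched.append(list(v))
    return perm, matched

basis1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
basis2 = [[0.0, -1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
perm, matched = synchronize(basis1, basis2)
```

Enumerating all permutations is only feasible for a handful of eigenvectors. For the $s = 100$ eigenmaps used in the examples, the same objective would have to be handed to an assignment solver.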

We  then  map  the  day  two  eigenvectors  as  follows: 

Finally,  after  permuting  the  day  two  eigenvectors,  we  make  the  appropriate  sign 
changes  if  the  angle  between  two  eigenvectors  is  greater  than  90  degrees: 

We also permute the eigenvalues accordingly. First:


cf  ^  -  c 


(2) 

a(i)' 


And  then  secondly,  after  the  permutation: 


1 How this manual is organized


The goal of this manual is to guide a first-time user through the installation, setup, and first steps in using HyperSpectral Explorer, progressively building knowledge of the software's capabilities, and ultimately mastering all of them.

2 Scope & Objectives

The  main  objectives  of  HyperSpectral  Explorer  are  to  provide  the  following: 

2.i  Data  normalization,  compression  and  denoising 

Interface  for  initial  manipulation  of  the  collected  data. 

The type of manipulations we are referring to may include data denoising, compression, normalization, computation of the SVD of the data, and possibly other data selection/enhancement algorithms.

2.ii Data Exploration

Interface  for  algorithms  for  data  exploration 

This  involves  the  development  of  a  user-friendly,  but  still  powerful  (for  the  most 
knowledgeable  user)  interface  to  various  algorithms  (either  developed  by  the  group  or 
described  in  papers  etc...)  for  the  exploration  of  data:  among  these,  e.g.,  projections  on 
particular  coordinates  (svd,  random  projections),  mainly  for  dimensionality  reduction 
purposes;  nonlinear  transformations  (kernel  eigenmap)  for  nonlinear  dimension 
reduction/parametrization;  data  (labelled/unlabelled)  dynamic  exploration  in  various  sets 
of  coordinates  (linear/non-linear) 

2.iii  Learning/discovering  structures  in  the  data 

Interface  for  algorithms  for  discrimination 

Development  of  a  user-friendly  interface  to  algorithms,  developed  by  the  group  or 
elsewhere,  supervised  or  unsupervised,  for  data  clustering  and  other  discrimination  tasks. 
Learning  algorithms  may  include  Local  Discriminant  Bases  for  supervised  feature 
selection,  fast  hierarchical  clustering  techniques,  unsupervised  or  partially  supervised, 
non-linear CART based on Laplacian eigenfunctions, etc. The output of these algorithms
will be available to the user for further manipulations and exploration.

2.iv  Data  sonification 

This part of the user interface will allow the user to map the original data, or any of its representations in feature spaces, to various sound spaces, using tunable maps and a choice among different sound spaces. This allows the exploration of the data through sounds, allowing the sonification of features as well as, possibly, the discovery of features through sound exploration.

These capabilities may not be available on all installations or on all configurations.

3 Installing the HyperSpectral Explorer

3.i  Requirements 

HyperSpectral  Explorer  requires  Windows  2000  or  XP  operating  system. 

Version 1.0 of HyperSpectral Explorer also requires a licensed version of Matlab 6 (release 13) to be installed locally, with the license server either local or remote. Use of remote Matlab servers has not been tested.

The current version of HyperSpectral Explorer is not compatible with versions of Matlab later than version 6 (release 13), since Mathworks does not guarantee forward compatibility of the Matlab library functions.

The  user  needs  administrator  privileges  to  install  and  run  HyperSpectral  Explorer. 
Moreover  the  user  needs  the  maximum  COM/DCOM/COM+  privileges:  contact  your 
system  administrator  if  you  are  not  sure  whether  you  have  these  and/or  how  to  set  them. 

HyperSpectral Explorer also requires the Microsoft XML Parser to be installed. This is
already installed during most installations of Windows XP; otherwise it needs to be
installed prior to the installation of HyperSpectral Explorer. The installation package can
be found at:

http://www.microsoft.com/downloads/details.aspx?FamilyID=c0f86022-2d4c-4162-8fb8-66bfc12f32b0&displaylang=en

or by Googling “Microsoft MSXML parser download”.

3.ii  Starting the installation process

On the installation disk, simply run the executable file called setup.exe. This will start the
installation process.
3.iii  Guide through the installation process

[Screenshot: InstallShield Wizard, “Customer Information” dialog, with User Name, Company Name, and Serial Number fields and Back, Next, and Cancel buttons.]

Enter your name, company name, and any serial number.

[Screenshot: InstallShield Wizard destination folder dialog. Setup will install HyperSpectral Explorer in the destination folder C:\...\Plain Sight Systems\HyperSpectral Explorer. To install to this folder, click Next; to install to a different folder, click Browse and select another folder.]

Select a directory in which to install the software.

[Screenshot: InstallShield Wizard, “Select Program Folder” dialog. Setup will add program icons to the program folder, “HyperSpectral Explorer” by default. You may type a new folder name or select one from the existing folders list, then click Next to continue.]

Select a folder in which to install the shortcut to HyperSpectral Explorer.


The  setup  program  will  then  install  the  Hyperspectral  Explorer  files  in  the  specified 
directory,  add  the  shortcuts  to  the  specified  folder  and  add  a  shortcut  on  the  desktop. 

The  software  is  now  ready  to  run. 
4  Starting HyperSpectral Explorer

4.i  Starting the program

Simply click on the icon in the PlainSight System folder created in the Start Menu during
the installation, or on the icon placed on the desktop:

[Icon: HyperSpectral Explorer]

If neither icon is available, the executable is in the installation directory and is
named prHyperSpectralExplorer.exe.

4.ii  Loading

During the loading phase, HyperSpectral Explorer shows progress messages on a
splash screen (colors may vary depending on the system settings):


Loading time will vary according to the speed of the machine, the memory available, the
security settings and the Matlab installation type (on the machine or remote). During loading,
HyperSpectral Explorer connects to the currently registered version of Matlab (either
on that machine or on a remote machine). This is usually the longest procedure during
loading.
5  Organization of the User Interface

5.i  General  Overview 


Below is how HyperSpectral Explorer presents itself after loading (minor differences may
depend on your system configuration). The screen is divided into three main parts:


The  three  parts  can  be  resized  by  using  the  vertical  green  slidebar  separating  the  Object 
PageViewer  from  the  Object  Tree  and  the  Algorithm  Browser,  while  the  horizontal  blue 
slidebar  can  be  used  to  redistribute  the  space  between  the  Object  Tree  and  the  Algorithm 
Browser. 


At the top of the window there is a menu with some standard functionalities, while below
the menu there is a very useful toolbar, whose buttons perform most of the actions available
on the objects in the workspace.
5.ii  The Object Tree


The Object Tree, top left, represents the objects currently available to the user, divided
into conceptual categories or groups. Initially the tree is empty; only the object categories
appear. When objects are loaded, manipulated and created, they appear in the branches of
the tree. Right-clicking on an object in the tree activates a pop-up menu that allows
the user to perform actions on the object and make it interact with other objects.

At the end of the tutorial in this manual, the Object Tree may look as in the picture below.
Many objects have been created during the analysis of one initial data cube, “Example1”.
Some of them represent data cubes whose third dimension is not a spectral frequency
but a probability, or a coefficient of a projection onto some features.
There are also orthogonal bases among the objects created, as well as nonlinear maps and
training sets. Popup menus are accessed by selecting an element and right-clicking.

[Screenshot: Object Tree at the end of the tutorial, with the groups Datacubes, TrainingDatacubes, LabelSets, TrainingSets, Expansions, Classifications, Bases, NonlinearMaps, and Sonification, and a popup menu offering: Add to other groups, Expand on vectors, Apply nonlinear map, Extract subset, Extract random subset, View Description, Log, View in new window, Save, Delete object.]
5.iii  The Object PageViewer

The Object PageViewer is a set of tabs, divided conceptually into the same categories as the
objects in the Object Tree; each tab visualizes certain types of objects. The
categories are a (possibly strict, depending on system configuration) subset of:

-  Preview: when connected to the hardware, on this page the user will be able to
preview the images captured by the instrument, collect pictures, set the parameters for
the collection, etc. This option may or may not be available depending on the connected
hardware and on the installation options for the hardware driver.

-  Datacubes: on this page datacubes and other data are represented. For example, when
the user loads a datacube, a viewer for the loaded object is displayed on this page.
Essentially, this page should hold all the “working objects” the user wants to
manipulate.

-  Training: on this page the datasets that the user chooses to use for training are
displayed. These datasets are usually imported from the set of “working objects” in the
Datacubes page. Objects on the Datacubes page can be moved onto the Training page by
right-clicking on the datacube object in the Object Tree and choosing the action
“Add to training set”.

-  Classifications: the data on this page is the result of a classification task. It can be
a set of vectors with corresponding labels, represented as points in a vector space,
with colors indicating the corresponding labels.

-  Bases: the data on this page represents basis objects, e.g. the basis vectors resulting
from the run of a linear discriminant basis algorithm.

-  Expansions: the data on this page represents the image of some datacube object
under a linear or nonlinear mapping, for example the projection of a datacube
object onto its SVD basis.

-  Sonification: the data on this page is for sonification purposes. Not available in all
configurations.
5.iv  The Algorithm Browser

The  Algorithm  Browser,  bottom  left,  is  a  multi-page  collection  of  user  interfaces  linking 
to  various  algorithms  the  user  can  apply  on  the  available  objects.  It  contains  various 
pages,  depending  on  system  (hardware/software)  configuration,  among  the  following: 

-  Preview: parameters and actions related to the preview and collection of new
datacubes. Not available in all configurations.

-  Train:  interface  for  building  and  merging  labelled  training  sets. 

-  Algorithms:  interface  to  various  algorithms  that  act  on  datacube  objects  and/or  on 
training  objects. 

-  Sonification:  interface  to  various  sonification  algorithms. 
[Screenshot: Algorithm Browser showing the “Nearest Neighbor Classifier” page, with callouts for: access to the training page, the algorithm pages, the list of algorithms, the inputs to the algorithm (a “Labelled Training Set” dropbox and a “Classify” check list of datacubes), the button that refreshes the list of inputs, the Help button, the parameters of the algorithm (estimation method: 15 nearest neighbors; metric: euclidean), and the Run button that runs the algorithm.]
6  Main Menu


6.i  File Menu

Open

Opens an existing Matlab, HSE, or PND file in HyperSpectral Explorer.

Save

Saves all the files in the current workspace to HSE format files.

Save Matlab workspace

Saves all the variables currently loaded in Matlab to a Matlab file.

Exit

Exits from HyperSpectral Explorer.

6.ii  External Programs Menu

Connect to Matlab

Connects to the Matlab server registered on the machine (local or network Matlab server).

Disconnect from Matlab

Disconnects from the Matlab server HyperSpectral Explorer is currently connected to. This
will cause all the variables currently in Matlab to be freed from memory and permanently
lost.

Matlab - Reset random number generator

Resets the random number generator in Matlab. This affects the runs of randomized
algorithms.

6.iii  View Menu

View application log

Views the application log file, containing all the actions of the application since startup.

6.iv  Windows Menu

Tile horizontally

Tiles the windows in the current page of the Object PageViewer horizontally.

Tile vertically

Tiles the windows in the current page of the Object PageViewer vertically.

6.v  Help Menu

Help

Shows this help in a window in HyperSpectral Explorer.

About

Shows the program title, version, and authors.
7  Getting Started: an example

We show some capabilities of the HyperSpectral Explorer through an example.

7.i  Starting up and loading a file

Start up the HyperSpectral Explorer.

Click on the Open button at the top left (or select the Open command from the File
menu).


The dialog box represented below will appear:

[Screenshot: Open dialog box.]

Go to the “Examples” directory in the installation directory of HyperSpectral Explorer.
Select “Matlab files” in the file type dropbox.

Choose the file named Example1.mat and click Ok.

The dialog box below will appear, asking which group the loaded variable should be
assigned to. Each object can be assigned to one or more data-groups, provided it is
compatible with them. Data-groups are useful to divide the various loaded objects into
classes with different properties, different roles in the algorithms and different conceptual
types.

In this specific case, we assign the new variable to the “Datacubes” group. We will see
later how to assign the variable to other groups.

Click Ok to load the data (this may take a few seconds depending on the specifications of
your system, in particular CPU speed and available memory).


PLAIN  SIGHT  SYSTEMS  -  CONFIDENTIAL  AND  PROPRIETARY 


7.ii  Browsing through the data


The file will be loaded, an object called “Example1” will be added to the Datacubes
branch in the Object Tree, and a viewer for the object will be added to the Datacubes page
in the Object PageViewer. The Algorithm Browser will switch to the Datacubes page:

[Screenshot: main window after loading “Example1”.]


Let  us  browse  through  the  data  to  see  how  the  data  cube  browser  works. 


The dropbox at the top left allows for different colourings of the datacube. A single
slice can be viewed in gray scale, rgb, hsv and arbitrarily tuned color maps, while
multiple spectral slices can be combined in two ways: three of them can be combined by
mapping them to rgb (option 'custom'), or arbitrary linear combinations of slices can be
taken with the 'equalizer' option.


The highlighted button sums all the slices and displays their sum in grayscale. For
datacubes, this essentially shows the total white-light response.
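
The three ways of turning spectral slices into a picture described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-list layout `cube[row][col][band]` and the function names `sum_slices`, `combine_slices` and `to_rgb` are hypothetical and are not part of HyperSpectral Explorer.

```python
# Hypothetical toy datacube: cube[row][col] is the spectrum of one pixel.
cube = [
    [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]],
    [[2.0, 2.0, 2.0], [4.0, 0.0, 1.0]],
]

def sum_slices(cube):
    """Sum all spectral slices, giving one gray value per pixel."""
    return [[sum(spectrum) for spectrum in row] for row in cube]

def combine_slices(cube, weights):
    """Arbitrary linear combination of slices, one weight per band
    (the idea behind the 'equalizer' option)."""
    return [[sum(w * v for w, v in zip(weights, spectrum)) for spectrum in row]
            for row in cube]

def to_rgb(cube, r, g, b):
    """Map three chosen slices to red, green and blue
    (the idea behind the 'custom' option)."""
    return [[(spectrum[r], spectrum[g], spectrum[b]) for spectrum in row]
            for row in cube]

total = sum_slices(cube)                       # total response per pixel
contrast = combine_slices(cube, [1.0, 0.0, -1.0])  # first band minus last band
rgb = to_rgb(cube, 2, 1, 0)
```

Summing is the special case of a linear combination in which every weight equals 1.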


This dropbox allows for visualizing different slices of the datacube. One dropbox is
available for the 'gray', 'rgb', 'hsv' and 'tuner' choices of coloring, while three are
available when 'custom' is the choice of coloring.


The 'Show colorbar' button toggles the color scale at the side of the picture on and
off. The 'Zoom' button toggles on and off a window at the bottom of the viewer where
the spectra of the selection are shown.

CAUTION: The red button for closing the pictures window inside the browser
should never be used to close


As  an  exercise,  you  can  reproduce  the  image  above  by  setting  the  parameters  in  the  data 
cube  browser  and  selecting  the  region  of  the  data  cube  indicated  above. 
7.iii  Running your first algorithm

Let  us  go  through  some  manipulations  of  the  dataset. 

First  of  all,  we  observe  that  the  dimensionality  of  the  data  set  is  quite  high:  namely,  128. 
However,  these  spectra  are  noisy  and,  without  noise,  we  expect  them  to  be  smooth,  hence 
intrinsically  lower  dimensional.  We  thus  decide  to  compute  the  principal  components,  and 
to  project  onto  the  top  few  principal  components.  In  order  to  do  this,  we  select  the 
algorithm  tab  in  the  Algorithm  Browser  and  switch  to  the  tab  named  “PCA”. 

If  the  currently  loaded  datacube  does  not  appear  in  the  list  of  datacubes,  press  the  refresh 
button  at  the  right  of  the  datacube  list. 

Then: 

-  select “Example1” in the list of input datacubes,

-  move the slidebar corresponding to the “Number of PC's” to 10%, which means we want
to compute only 10% of 128, i.e. 12, principal components,

-  move the slidebar corresponding to “Number of random samples” to 8%, which means
that the principal components will be computed on a random subset of the spectra only, of
size 8% of 151*151 spectra, which is approximately 1800 spectra out of about 22800. We
expect this to be rather accurate because the structure of the spectra does not seem that
complicated. The screen should now more or less look like this:


[Screenshot: PCA algorithm page with the settings above.]
Now hit the Run button at the bottom of the algorithm page.

You  will  be  asked  for  a  name  for  the  variable  being  created  by  the  algorithm,  which  is 
going  to  be  an  orthonormal  set  of  vectors.  We  are  going  to  call  it  “SVDBasis”.  After 
typing  the  name  of  the  variable,  click  Ok. 


After  running,  the  algorithm  will  create  a  variable  called  “SVDBasis”,  in  the  group 
“Bases”. 

Double  click  on  “SVDBasis”  in  the  Object  Tree,  and  you  will  see  the  basis  vectors 
displayed  in  the  “Bases”  page  view: 
We have 12 basis vectors, as you can see from the picture. You can also select SVDBasis,
right-click and select “View Description” to obtain information about this variable.
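
The count of 12 basis vectors follows from the slider settings chosen earlier (10% of 128 bands, and 8% of the 151*151 spectra for the random sample). A minimal sketch of that arithmetic is given below; the helper `percent_to_count` is hypothetical, and rounding down is an assumption that happens to agree with the 12 components reported by the manual.

```python
import math

def percent_to_count(percent, total):
    """Convert a slider percentage into a whole number of items.
    Rounding down is an assumption, not documented behaviour."""
    return math.floor(percent * total / 100)

n_bands = 128              # spectral dimension of the example datacube
n_spectra = 151 * 151      # one spectrum per pixel

n_components = percent_to_count(10, n_bands)   # principal components to compute
n_samples = percent_to_count(8, n_spectra)     # spectra in the random subset

print(n_components, n_samples, n_spectra)
```

With these numbers the principal components are estimated from roughly one spectrum in twelve.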
7.iv  Linear projections

Let us go through some manipulations of the dataset.

Now we want to project the original data cube onto this basis, thus reducing the
dimensionality from 128 to 12. This can be done in two equivalent ways; in general,
any action on any object can be performed analogously in these two ways.
Using popup menus: select “Example1” in the Object Tree and right-click with the mouse.
A popup menu like the one in the picture will appear. Select “Expand on vectors” and
then “SVDBasis” in the submenu.

Using the toolbar buttons: select “Example1” in the Object Tree and then click on the
button “Expand on basis” (you can see the hint for each button simply by resting the
mouse on the button for a fraction of a second). The list of available bases to project
onto will drop down: click on “SVDBasis”.

Accept the choice of name for the new variable.

The algorithm will run quickly. Double-click on “Example1OnSVDBasis_0” to see a
browser for the result, or simply switch to the Datacubes page.
Browse through the various layers of the cube: there are 12 of them, one per principal
component computed, since each layer is the projection of the data cube onto the
corresponding principal component. Observe for example that the first layer essentially
represents the projection onto light intensity (the variable with the greatest variance in
the data cube), while the others represent finer and finer variations.
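
Expanding a data cube on an orthonormal set of vectors amounts to taking one inner product per spectrum and per basis vector, so each output layer holds the coefficients on one vector. The sketch below shows this on toy data; the function `expand_on_basis` and the flat list-of-spectra layout are hypothetical, chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def expand_on_basis(spectra, basis):
    """Return, for each spectrum, its coefficient on each basis vector.
    Assumes the basis vectors are orthonormal."""
    return [[dot(spectrum, vector) for vector in basis] for spectrum in spectra]

# Toy example: 3 spectral bands reduced to 2 coefficients.
basis = [
    [1.0, 0.0, 0.0],
    [0.0, 0.6, 0.8],
]
spectra = [
    [2.0, 3.0, 4.0],
    [1.0, 0.0, 0.0],
]
coefficients = expand_on_basis(spectra, basis)
```

In the tutorial the same operation maps each 128-band spectrum to 12 coefficients, one per principal component.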
We can now project this new variable onto the unit sphere: select the “Normalization”
algorithm tab, select “Example1OnSVDBasis_0” as input datacube, select L2 in the
“Spectrum Normalization” dropbox, and click the Run button.

Call the new variable “NormalizedData” when prompted for a name.
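
Projecting onto the unit sphere with the L2 option means dividing each spectrum by its Euclidean norm. A minimal sketch follows; the function name `l2_normalize` is hypothetical, and leaving an all-zero spectrum unchanged is an assumption, since the manual does not say how that case is handled.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # Assumption: an all-zero spectrum is returned unchanged.
        return list(vector)
    return [x / norm for x in vector]

unit = l2_normalize([3.0, 4.0])
unit_norm = math.sqrt(sum(x * x for x in unit))
```

After this step only the shape of each spectrum matters, not its overall intensity.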

The  resulting  data  cube  will  look  something  like  this: 
7.v  Nonlinear maps and projections

Suppose now we would like to find a good parametrization for the spectra in this data
cube, by using a nonlinear dimension reduction tool such as the Laplacian eigenfunctions.
The dataset is quite large, and while the computation is feasible on a standard PC, we
would like to do a quick computation, maybe paying a little in terms of precision. So we
would like to extract a relatively small random subset of “NormalizedData”, compute the
Laplacian eigenfunctions on this subset, and then extend them to the whole set.

In  order  to  do  this,  select  NormalizedData  in  the  Object  Tree. 

Either right-click and select “Extract random subset”, or click the corresponding
button on the toolbar. Select a random subset size of about 1000-1500 samples, and
accept the default name “NormalizedData_rnd0” for the new variable.
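
Extracting a random subset can be pictured as drawing distinct indices without replacement. The sketch below uses Python's standard `random` module; the function `extract_random_subset` and its `seed` parameter are hypothetical, and fixing the seed plays a role similar to resetting the random number generator from the menu.

```python
import random

def extract_random_subset(spectra, size, seed=0):
    """Pick `size` distinct spectra at random; return their indices and values."""
    rng = random.Random(seed)              # seeded for reproducible runs
    indices = sorted(rng.sample(range(len(spectra)), size))
    return indices, [spectra[i] for i in indices]

# Toy stand-in for the normalized spectra (one short vector per pixel).
spectra = [[float(i), float(i) / 2] for i in range(200)]
indices, subset = extract_random_subset(spectra, size=20)
```

Running the function twice with the same seed returns the same subset, which makes results of randomized algorithms repeatable.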

Go  to  the  “Laplacian  Eigenfunctions”  algorithm  tab. 

Choose “NormalizedData_rnd0” as input vector set.

Leave all the parameters as set by default, but change the parameter “Delta” to 0.4. This is
the width of the kernel used. Since all the points in “NormalizedData” are on the unit
sphere, we want to localize the kernel at a scale smaller than the unit radius of that sphere.
Click the Run button, and choose “NormalizedData_rnd0_eigenmap” and
“NormalizedData_rnd0_eigenimage” as names for the output variables.
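
To see why a width of 0.4 localizes the kernel on the unit sphere, one can evaluate a kernel at a few distances. The manual does not state the kernel's formula, so the Gaussian form `exp(-d^2 / delta^2)` used below is an assumption, as is the function name `gaussian_kernel`.

```python
import math

def gaussian_kernel(x, y, delta):
    """Gaussian weight between two points; `delta` is the kernel width.
    The exact formula used by the software is not documented (assumption)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / delta ** 2)

delta = 0.4
north = [0.0, 0.0, 1.0]
nearby = [0.0, 0.4, math.sqrt(1 - 0.4 ** 2)]   # another point on the unit sphere
south = [0.0, 0.0, -1.0]                       # antipodal point, distance 2

w_self = gaussian_kernel(north, north, delta)
w_near = gaussian_kernel(north, nearby, delta)
w_far = gaussian_kernel(north, south, delta)
```

Under this assumption, points closer than about `delta` keep a substantial weight, while points on the far side of the sphere contribute essentially nothing.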

The Laplacian algorithm creates two outputs. The first is a nonlinear map, which can now
be applied (extended) to any dataset of the correct dimensionality. The second is the
result of applying this nonlinear map to the data “NormalizedData_rnd0” that was input
to the algorithm: this is a new variable, which is automatically put in the “Expansions”
group.

Now we would like to apply (extend) the map to the whole dataset. We select
“NormalizedData”, then right-click and select “Apply nonlinear map” and
“NormalizedData_rnd0_eigenmap” in the submenu. Call the new variable “Eigenimage”.
A datacube named “Eigenimage” is created, the k-th layer representing the value of the k-th
eigenfunction evaluated on the spectrum of each pixel. Browse through the datacube to
get a feeling for this result.
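
The manual does not say how the map is extended from the random subset to the whole dataset. One common technique for this kind of extension is the Nyström formula, sketched below purely as an illustration: it works on a plain Gaussian kernel matrix rather than a normalized graph Laplacian, and the names `gaussian_kernel` and `nystrom_extend` are hypothetical.

```python
import math

def gaussian_kernel(x, y, delta):
    """Gaussian weight between two points (assumed kernel form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / delta ** 2)

def nystrom_extend(point, samples, eigvec, eigval, delta):
    """Evaluate an eigenvector of the sample kernel matrix at a new point:
    phi(y) = (1 / lambda) * sum_i k(y, x_i) * phi(x_i)."""
    weighted = sum(gaussian_kernel(point, x, delta) * p
                   for x, p in zip(samples, eigvec))
    return weighted / eigval

# Two sample points on the unit circle. Their kernel matrix [[1, a], [a, 1]]
# has the eigenvector (1, 1)/sqrt(2) with eigenvalue 1 + a.
samples = [[1.0, 0.0], [0.0, 1.0]]
delta = 0.4
a = gaussian_kernel(samples[0], samples[1], delta)
eigvec = [1 / math.sqrt(2), 1 / math.sqrt(2)]
eigval = 1 + a

# At a sample point the extension reproduces the eigenvector entry.
at_sample = nystrom_extend(samples[0], samples, eigvec, eigval, delta)
# At a new point the value is interpolated from the samples.
new_point = [math.sqrt(0.5), math.sqrt(0.5)]
at_new = nystrom_extend(new_point, samples, eigvec, eigval, delta)
```

The cost of extending to a new point grows only with the size of the subset, which is why computing on 1000-1500 samples and then extending is much cheaper than working on all spectra at once.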
7.vi  Supervised algorithms: LDB

All the algorithms we have run so far were unsupervised, in the sense that no training
set and no classes to be learnt were fed into the algorithms. When one knows which
classes one would like to distinguish and is able to identify members of each class,
those members can be used for training purposes, and algorithms can seek the variables
that allow for good predictions of the desired classes.

We  consider  here  the  example  of  the  supervised  algorithm  called  Local  Discriminant 
Bases. 

The first step is to construct a training set. To do so, we need to specify one or more data
cubes to be used for training purposes. We can pick our original datacube “Example1”
and add it to the “TrainingDatacubes” group. To do this, we can select “Example1”,
right-click and select “Add to groups” (the same result can be obtained, as the user can
probably guess by now, by clicking on the “Add to groups” button in the toolbar). Then
we check the “TrainingDatacubes” group and click Ok. “Example1” gets added to the
“TrainingDatacubes” group.

We now select the “Train” page, a tab side by side with the “Algorithms” tab. Now we
want to create a new training set, i.e. a set of vectors with corresponding labels. The
software distinguishes between two concepts: 'label set' and 'training set'. The first
denotes a collection of labels associated with a subset of spectra from one or more
Training datacubes; the second denotes a set of labelled vectors, however obtained.
Of course a supervised algorithm will act only on a training set. This is typically obtained
within the HyperSpectral Explorer by creating a 'label set' and then 'building' from it a
training set. Let us go through this process.
At this point the software has created for us a label set called “LabelSet0”, which allows
labels to be put on points in the datacube “Example1” only. If we had other datacubes in the
“TrainingDatacubes” group we would be able to add more of them to the pool available to
this label set. The user can also change the number of labels in the label set. For the
moment, 3 is the default and is good for our purposes.

Let  us  build  the  label  set  as  follows.  Select  a  region  from  the  datacube  as  represented  in 
the  picture  below  (green  is  label  1,  blue  is  label  2,  cyan  is  label  3),  and  label  it  accordingly 
by  clicking  on  the  corresponding  label  button  in  the  train  page.  If  a  selection  is  wrong, 
simply  select  a  region  that  is  wrong  and  click  the  “Un-Select”  button  to  erase  any 
selections  in  that  region. 

Try to select more or less the regions as in the picture below. These selections are
motivated by knowledge from pathology: class 1 essentially represents nuclei, class 2
cytoplasm, and class 3 is glass or other not very interesting stuff.
At this point we have the desired label set. To build a training set, i.e. an actual set of
vectors with labels, press the “Build Training Set” button, which is hammer-shaped and is
the third button from the left. A TrainingSet called “LabelSet0_Data0” is created in the
“TrainingSets” group. By double-clicking on it one gets a picture similar to this:
The color of the points represents the class, and a legend at the top of the figure gives
the color-class correspondence. The points of different classes appear all mixed
up. But in which coordinates are we looking at them? Well, these coordinates are the
bottom coordinates of the spectrum, which are quite noisy, so it is not a surprise we
cannot see anything. Play a little with changing the coordinates. After browsing through
various coordinates and rotating the axes a little, one could get a picture like the
following:

which looks, of course, much more promising for classifying points from the different
classes. Of course one has to get quite lucky, in general, to find such good coordinates.
Local Discriminant Bases is able to find coordinates such that the projection onto them
best preserves the discrimination between the classes. Each coordinate selected by the
algorithm is in general a complicated linear superposition of the original coordinates, so it
is much more flexible than what we just tried, which was to select just 3 among the original
coordinates.
So let's go back to the “Algorithm” tab and select LDB. Let's keep the default parameters
(pretty good for rather general problems), and click Run. Let's call “LdbBasis” the output
of the algorithm. When the algorithm finishes running, we have a new basis of 128
vectors, ordered by “discrimination power”. Let's select only the top 8 vectors, say, by
selecting “LdbBasis”, right-clicking and selecting “Extract Basis Subset”, and ticking
only the top 8 vectors. Double-clicking on “LdbBasis_Sub_0” reveals the discrimination
vectors found:


The vectors found during your experiment may be different, since they depend strictly on
the training set you chose.
Let's project the original data cube “Example1” onto this subset of the LdbBasis (select
“Example1”, right-click, select “Expand on Vectors”, then LdbBasis_Sub_0), thus getting
a data cube called “Example1OnLdbBasis_Sub_0_0”. It is instructive to browse through
it.




7.vii  Supervised algorithms: Nearest Neighbors

We can view “Example1OnLdbBasis_Sub_0_0” as an intelligently dimension-reduced
version of “Example1”, where the dimensionality reduction was driven by the training set
for the purpose of classification, in contrast with the “blind” principal components analysis,
which compresses the data without any classification goal or information.

At this point we can run a nearest-neighbor classifier on the projection. In high dimension
it would be essentially bound to fail because of the high dimensionality combined with
noise, but in lower dimension the data is less noisy, because of the projection onto smooth
functions, and at the same time the LDB vectors try not to lose too much discriminant
information.

So we first project the training set “LabelSet0_Data0” onto the top LDB vectors
LdbBasis_Sub_0 (select “LabelSet0_Data0”, right-click, select “Expand on Basis”, select
LdbBasis_Sub_0). We get “LabelSet0_Data0OnLdbBasis_Sub_0_0”. Double-click on it
to see how well the coordinates found by LDB separate the classes of the training
samples:
Then we go to the “NN Class.” algorithm. Select
“LabelSet0_Data0OnLdbBasis_Sub_0_0” and “Example1OnLdbBasis_Sub_0_0”, and
leave the other parameters as default. Click Run. Call the output “NNClassification”. A
data cube will be created, the k-th layer representing the probability of each pixel to be in
class k. Browse through or combine the layers.
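
A nearest-neighbor classifier that outputs one probability layer per class can be sketched as follows. How the software estimates these probabilities is not documented; taking the fraction of the k nearest training vectors that carry each label is an assumption, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def euclidean(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn_class_probabilities(point, training_set, n_classes, k):
    """Estimate class probabilities for `point` as the fraction of its k nearest
    training vectors carrying each label (labels are numbered 1..n_classes)."""
    nearest = sorted(training_set, key=lambda item: euclidean(point, item[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return [counts.get(c, 0) / len(nearest) for c in range(1, n_classes + 1)]

# Toy training set: (vector, label) pairs in a 2-dimensional feature space.
training_set = [
    ([0.0, 0.0], 1), ([0.0, 1.0], 1),
    ([5.0, 5.0], 2), ([5.0, 6.0], 2),
    ([9.0, 9.0], 3),
]
probabilities = knn_class_probabilities([0.1, 0.2], training_set, n_classes=3, k=3)
```

Applying the function to every pixel and storing the k-th probability in the k-th layer gives a cube with the layout described above.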

8.ii Definitions

A Label Set is simply a set of labels attached to corresponding vectors in one or more datacubes. It is an ensemble of graphical objects, not a set of vectors. A Label Set is defined by:

-  the  set  of  Training  Datacubes  from  which  labelled  samples  can  be  selected, 

-  the  set  of  Labels  that  can  be  used  (usually  numbered  from  1  to  L) 

A  Label  Set  differs  from  a  Training  Set  in  that  a  Training  Set  is  the  set  of  vectors  (living 
in  some  N-dimensional  vector  space)  with  the  labels  attached,  which  can  then  be  input  to 
any  supervised  classification  algorithm.  After  defining  the  Label  Set  the  user  can  Build  a 
Training  Set  from  the  Label  Set  by  selecting  the  Build  button  from  the  Train  toolbar  at  the 
top  of  the  Train  page. 

8.iii  User  Interface 

Situated at the top of the Train page is a toolbar that contains four icons:

1)  New  label  set:  creates  a  new  label  set,  adding  it  to  the  list  of  available  Label  Sets. 

2)  Refresh  the  list  of  available  datacubes,  in  case  some  datacubes  have  been  recently 
loaded/moved  and  don't  appear  in  the  list  of  available  datacubes. 

3)  Build  a  training  set  out  of  Label  Set. 

4)  Add  a  Label  Set  to  an  existing  Training  Set. 

Below  the  toolbar,  there  is  a  combobox  listing  the  existing  Label  Sets. 

Below the combobox is the check list of available Training Datacubes. This should contain all the Training Datacubes shown in the Object TreeView; if it does not, the user can force an update by clicking the Refresh button in the toolbar. Each Training Datacube may or may not participate in a Label Set, depending on whether it is checked. If a Training Datacube is not checked, the user will not be allowed to add labelled samples from that Training Datacube. Otherwise, selected labelled samples from that Training Datacube can be added. If a Training Datacube is unchecked after samples from it were added to the Label Set, those samples will be removed from the Label Set.

Below the list of Training Datacubes is a control to select the number of labels. This controls how many different labels can be assigned to the points. When the number of labels is decreased, already assigned labels are not erased, but they will not be considered when building a training set.

8.iv  Constructing  a  Label  Set 

1.  Add/move  the  datacubes  that  you  want  to  use  in  the  Label  Set  to  the  Training 
Datacubes  group. 

2.  If  a  Label  Set  does  not  exist,  create  a  new  one  by  pressing  the  New  button  in  the  Train 
toolbar. 

3. In the check list of Datacubes involved, check the Training Datacubes that will participate in the Label Set (this can be changed at any later time). If the list does not contain all the available Training Datacubes, click the Refresh icon in the toolbar to force an update of the list.

4.  Select  the  number  of  labels  that  will  be  used  in  the  Label  Set. 

5.  The  screen  may  look  like  this: 

6. To add samples to a certain class, click on the Datacube Browser: the mouse cursor will become a crosshair, and every click will define a vertex of the polygonal region you are selecting. When you are finished selecting the region, double-click or press Enter.


7.  To  add  the  selected  samples  to  the  Label  Set  with  a  certain  label,  simply  press  the 
button  in  the  Labels  Palette  on  the  Training  page  corresponding  to  the  label  you  want 
to  assign.  The  selected  region  will  be  colored  with  a  color  corresponding  to  the  label. 

8. If you are unhappy with any part of your selection, simply select a region that you want to remove, and click the Un-select button in the Labels Palette: all the samples in the selected region, regardless of the class they were assigned to, will be removed from the Label Set.


9.  When  you  are  happy  with  your  selections,  press  the  Build  button  in  the  Train  page 
toolbar  to  build  a  Training  Set.  The  training  set  you  will  create  will  be  added  to  the 
group  of  Training  Sets  and  displayed  in  the  corresponding  page  of  the  Object 
Page  Viewer. 
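The relationship between a Label Set and the Training Set built from it can be sketched as follows. This is only an illustration under assumptions: the Label Set is represented as an integer mask over the pixels (0 meaning unlabelled), and the names `build_training_set`, `label_mask` and `n_labels` are hypothetical, not part of HSE.

```python
import numpy as np

def build_training_set(cube, label_mask, n_labels):
    """Collect the labelled spectra of a datacube into (vectors, labels).

    cube:       (H, W, B) datacube.
    label_mask: (H, W) integers, 0 = unlabelled, 1..L = assigned label.
    n_labels:   current number of labels; samples carrying a larger label
                are kept in the mask but ignored here.
    """
    cube = np.asarray(cube)
    label_mask = np.asarray(label_mask)
    keep = (label_mask >= 1) & (label_mask <= n_labels)
    vectors = cube[keep]        # (n_samples, B), in row-major pixel order
    labels = label_mask[keep]   # (n_samples,)
    return vectors, labels
```

Ignoring labels above `n_labels` mirrors the behaviour described for the number-of-labels control: such labels are not erased, but they do not enter the training set.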

9 The Main Toolbar


9.i  Overview 

The  main  toolbar  contains  buttons  to  perform  most  of  the  available  actions  on  the  objects 
inside  HyperSpectral  Explorer. 



Here  is  a  description  of  their  functionality,  from  left  to  right. 

9.ii Open

Opens a Matlab, HSE or pnd file. The same as the command accessible from the main menu: File-Open.

9.iii  Save 

The  same  as  the  command  accessible  from  the  main  menu:  File-Save. 

9.iv  Add  to  groups 

Used to copy an object to a different group. For example, a data cube in the “Datacubes” group could be moved to the “Training Datacubes” group.

9.v Expand on a basis

Used to project a given data cube or set of vectors onto a basis object. Select the object to be projected in the Object TreeView, and then the basis to project onto from the drop-down menu of the toolbar.
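As a rough illustration of what such a projection computes, the sketch below takes the inner product of every spectrum with every basis vector. It assumes the basis is stored as a (K, B) array of row vectors and that the expansion coefficients are plain inner products; both are assumptions, since the storage format and any normalization used by HSE are not described here.

```python
import numpy as np

def expand_on_basis(cube, basis):
    """Project each spectrum of an (H, W, B) cube onto K basis vectors.

    basis is a (K, B) array; the result is an (H, W, K) cube of coefficients.
    """
    cube = np.asarray(cube, dtype=float)
    basis = np.asarray(basis, dtype=float)
    return cube @ basis.T
```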

9.vi Apply nonlinear map

Used to apply a nonlinear map object to a data cube or set of vectors. Select in the Object TreeView the object the map should be applied to, and then the nonlinear map to apply from the drop-down menu of the toolbar.

9.vii Extract labelled subset

Extracts  a  subset  of  a  training  set  or  in  general  of  a  set  of  vectors  with  labels,  based  on  a 
choice  of  labels.  Very  useful  if  one  wants  to  study  further  one  class  obtained  through 
some  classifier:  this  class  can  be  extracted  using  this  feature  and  then  further  processed 
with  any  algorithm. 

9.viii Classification to training set

Converts a classification object into a training set, i.e. a set of vectors with the labels assigned by the classification.

9.ix  Extract  regular  subset 

Extracts a subset from a data cube in a regular fashion, allowing the user to downsample the set in a geometrically sensible way. For example, one can discard all even rows and columns in the x and y coordinates and keep only one sample every 8 in the spectral dimension.
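The example in the text corresponds to strided slicing of the cube. The sketch below assumes an (H, W, B) array layout and 1-based row and column numbering, so that "discarding the even rows" keeps array indices 0, 2, 4, and so on; the function name is hypothetical.

```python
import numpy as np

def regular_subset(cube, row_step=2, col_step=2, band_step=8):
    """Keep every row_step-th row, col_step-th column and band_step-th band."""
    return np.asarray(cube)[::row_step, ::col_step, ::band_step]
```

For instance, a 6 x 4 cube with 16 bands is reduced to 3 x 2 pixels with 2 bands under the default steps.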

9.x  Extract  random  subset 

Extracts a random subset from a data cube. Can be used to randomly downsample a data set, run a learning algorithm on the random subset (because of memory and/or time constraints), and then, if possible, extend the results from the random subset to the whole set. This is typically done, for example, in order to compute the Laplacian eigenfunctions (see the example in this user's guide).

9.xi  Extract  basis  subset 

Allows the user to pick only a few vectors out of the collection of vectors in a basis object. For example, after computing the principal components, one may want to keep only the few top vectors. The same applies to LDB basis vectors.

9.xii  View  variable  description 

Shows  a  description  of  the  currently  selected  object  in  the  Object  TreeView. 

9.xiii  View  variable  log 

Shows  the  log  of  the  actions  performed  on  the  selected  object  in  the  Object  TreeView. 

9.xiv  View  application  log 

Shows the log of the actions performed by the whole application.

9.xv Delete selected object

Deletes  the  object  selected  in  the  Object  TreeView. 

10 Files and Names of Variables


10.i  Names  of  variables 

Names of variables can only contain letters, numbers and underscores, and cannot begin with a number. In particular, spaces, dashes, colons, dots, commas, etc. are not allowed in any variable name.
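The naming rule above can be checked with a short regular expression. This is a sketch, assuming that "letters" means the ASCII letters A-Z and a-z; the helper name `is_valid_name` is hypothetical.

```python
import re

# Letters, digits and underscores only, and the first character is not a digit.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_name(name):
    """Return True if name satisfies the variable-naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

Under this check "LdbBasis_Sub_0" is accepted, while "0abc", "my-var" and "my var" are rejected.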

10.ii  File  names 

File names are subject to the same rules as variable names. Note that this poses considerable restrictions on file names with respect to the Universal Naming Conventions standards.

11 Algorithms

11.i Layout and getting help

This is what the Algorithm page may look like. Differences from the look on your system are due to different configuration, installation or system settings.

The Algorithm page itself contains several pages, one per available algorithm.


[Screenshot: the Algorithm page, showing the Nearest Neighbor Classifier page. Callouts: Access training page; Refresh list of inputs; List of algorithms; Inputs to algorithm; Help button; Algorithm pages; Parameters of the algorithm (Estimation method, Metric); Run the algorithm.]


Different sets of algorithms may be installed on your system.

Technical Note: Algorithms can be designed so that they plug quite naturally into the HSE engine, and we are considering releasing tools that will make this process accessible to third parties.

Every  algorithm  page  has  its  own  title,  a  small  toolbar  with  a  refresh  and  a  help  button, 
an  area  for  input  selection,  an  area  for  parameter  selection,  and  a  run  button  at  the  very 
bottom. 

The refresh button refreshes the list of available input parameters, when this is not synchronized with the actual object list.

The help button provides algorithm-specific help on what the algorithm does, on its input parameters, etc.

Example: the Principal Components algorithm, shown in the snapshot above. There is only one input category for this algorithm, described as “datacubes”. The object category from which admissible input objects can be selected is of course “Datacubes”. These objects are listed (if the list is not up to date, just press the refresh button), and the user can select one or more datacubes to which to apply the algorithm. The algorithm will be applied to each input datacube, in a serial fashion. The parameter section includes the two parameters of the algorithm: the number of principal components to compute and how many random samples to use for the computation. When the user presses 'Run' after selecting the input objects and the desired parameters, the user will be asked for the name of the (one) output object. Should the algorithm have more than one output, the names of all the outputs would be asked for, one at a time. The output is a basis, and it will automatically be added to the correct category when the algorithm returns it.


Inputs 

The input parameters area contains a set of controls that allow the user to select which objects should be input to the algorithm. Many algorithms have just one input of one type, but in general an algorithm can have many inputs. Inputs are divided by “conceptual role” in the algorithm; inputs with the same “conceptual role” come from the same object category, and can be a single object or multiple objects from the category, depending on the algorithm.

Parameters 

This section of the User Interface allows the user to select the desired parameters for the algorithm. The role and admissible range of each parameter are detailed in the help for the algorithm, which can be obtained by pressing the help button on the algorithm's toolbar.

Outputs 

Just before running, the algorithm will ask the user for the name(s) of the output variable(s). After the algorithm is run, the output objects will be added to the set of available objects, each in the appropriate category.

11.ii Basic manipulations

Short Description

Basic manipulations on a vector set.

Detailed Description

Allows the user to perform various basic manipulations of a vector set or a data cube, such as adding a constant and thresholding from below and above. The operations are performed in the order in which they are listed.

Parameters

Label            Description
Absolute value   Takes the absolute value of all the vector components of each vector.
Add Constant     Adds a constant to all the vector components of each vector.
Threshold below  Sets to the specified threshold all the elements below the specified threshold.
Threshold above  Sets to the specified threshold all the elements above the specified threshold.
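Because the operations are applied in the listed order, the result depends on that order. The following sketch applies them in sequence; the function and argument names are hypothetical, and the use of None to mean "threshold not applied" is an assumption.

```python
import numpy as np

def basic_manipulations(vectors, absolute=False, add_constant=0.0,
                        threshold_below=None, threshold_above=None):
    """Apply the four manipulations in the order in which they are listed."""
    out = np.asarray(vectors, dtype=float)
    if absolute:
        out = np.abs(out)                       # 1. absolute value
    out = out + add_constant                    # 2. add constant
    if threshold_below is not None:
        out = np.maximum(out, threshold_below)  # 3. raise small values
    if threshold_above is not None:
        out = np.minimum(out, threshold_above)  # 4. cap large values
    return out
```

For example, the components [-3, 1, 5] with absolute value, constant -2, lower threshold 0 and upper threshold 2 become [3, 1, 5], then [1, -1, 3], then [1, 0, 3], and finally [1, 0, 2].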

Inputs

Name        Description                          Multiplicity
Vector Set  The vector set to be operated upon.  0

Outputs

Name               Description                                              Multiplicity
Output Vector Set  The result of the manipulation of the input vector set.  Vector Set

11.iii Calculator

Short Description

Calculator between data cubes.

Detailed Description

Allows various computations to be done on datacubes.

Parameters

Label: Operation to perform
Description: The operation to be performed on the two datasets.
• +: Sum of the two vector sets.
• -: Difference of the two vector sets.
• *: Product of the two vector sets.
• /: Ratio of the two vector sets. Division by 0 yields 0.
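The only non-obvious rule above is that division by 0 yields 0. A minimal sketch of the four operations, assuming they act element by element on two arrays of the same shape (the text does not say so explicitly) and using a hypothetical function name:

```python
import numpy as np

def calculate(a, b, op):
    """Element-wise +, -, * or / of two equally shaped arrays; x / 0 gives 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("operands must have the same shape")
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        out = np.zeros_like(a)
        # Divide only where the denominator is nonzero; other entries stay 0.
        np.divide(a, b, out=out, where=(b != 0))
        return out
    raise ValueError("unknown operation: " + repr(op))
```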

Inputs

Name          Description                                Multiplicity
Vector Set 1  The set of points for the first operand.   1
Vector Set 2  The set of points for the second operand.  1

Outputs

Name               Description                                             Multiplicity
Output Vector Set  The result of the operation between the two datacubes.  1

11.iv Unsupervised Hierarchical Clustering

Short Description

Finds clusters in a dataset in an unsupervised way.

Detailed Description

Uses distance relationships between points, in a variety of metrics and with different definitions of similarity between groups of points, to find clusters in the dataset. The algorithm is randomized to allow the analysis of large datasets: one or more selections of random groups are followed by a hierarchical clustering on each group, then the resulting clusters are matched among the various sampling rounds to yield a more robust result. The algorithm is in general badly affected by high dimensionality of the data, since it is based on distance computations on data that is in general noisy. Usually the best results are obtained by running this algorithm after an initial dimension reduction.

Parameters

Label: Metric
Description: The type of metric to use when discovering clusters. Please note that for the density estimation the standard Euclidean metric is always used, no matter which type of density estimation is selected.
• euclidean: standard Euclidean distance.
• seuclidean: standardized Euclidean distance; each coordinate in the sum of squares is inversely weighted by the sample variance of that coordinate.
• cityblock: city block (or L^1) distance.
• mahalanobis: Mahalanobis distance.
• minkowski: Minkowski distance with exponent 2.
• cosine: one minus the cosine of the angle between the samples (treated as vectors).
• correlation: one minus the sample correlation between samples (treated as sequences of values).
• hamming: Hamming distance, the percentage of coordinates that differ.
• jaccard: one minus the Jaccard coefficient, the percentage of nonzero coordinates that differ.

Label: Linkage type
Description: Type of linkage between points to discover clusters.
• single: shortest distance.
• complete: furthest distance.
• average: average distance.
• centroid: center of mass distance. N.B.: the output is meaningful only if Euclidean distance is used.
• ward: inner squared distance.

Label: Number Of Classes
Description: How many clusters to look for.

Label: Number of Random Points
Description: Determines how many random points per cycle are considered.

Label: Number of Cycles
Description: How many clustering cycles to do; each cycle is on a different random subset of points, and the clusters obtained from different cycles are then matched across cycles to yield a coherent final labelling.

Label: Estimation Method
Description: Density estimation for the test points. If nearest neighbor, allows the user to select how many nearest neighbors to consider when assigning a label to a test point. If pdf estimation, the probability density is estimated with smooth kernels around the training points.
• nearest neighbor: estimate the class densities by putting a point in a class if that class is the closest one to the point, where the distance between a point and a class is the distance between the point and the closest point in the class. N.B.: uses the Euclidean metric for nearest neighbor, independently of the choice of the parameter <Metric> above.
• decaying pdf: estimate the class densities by smooth interpolation with a decreasing radial kernel. N.B.: the norm in the definition of the decreasing kernel is Euclidean, independently of the parameter <Metric>.
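Several of the options of the <Metric> parameter can be written out directly for a single pair of vectors. The sketch below follows the verbal definitions given in the list; it is not taken from the HSE implementation, the function name is hypothetical, and the per-coordinate sample variances needed by "seuclidean" must be supplied by the caller.

```python
import numpy as np

def pair_distance(x, y, metric="euclidean", variances=None):
    """Distance between two vectors for a few of the listed metrics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = x - y
    if metric == "euclidean":
        return float(np.sqrt(np.sum(d ** 2)))
    if metric == "seuclidean":
        # Each squared difference is divided by that coordinate's variance.
        return float(np.sqrt(np.sum(d ** 2 / np.asarray(variances, dtype=float))))
    if metric == "cityblock":
        return float(np.sum(np.abs(d)))
    if metric == "cosine":
        return float(1.0 - x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
    if metric == "correlation":
        xc, yc = x - x.mean(), y - y.mean()
        return float(1.0 - xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))
    if metric == "hamming":
        return float(np.mean(x != y))   # fraction of coordinates that differ
    raise ValueError("metric not covered by this sketch: " + repr(metric))
```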

Inputs

Name       Description                     Multiplicity
Datacubes  The datacubes to be clustered.  0

Outputs

Name: Classifications
Description: These datacubes (one per input datacube) are the classifications obtained from running the algorithm. The i-th slice of each at each point is the probability of that point belonging to the i-th cluster found.
Multiplicity: Datacubes

References

[1] http://www.resample.com/xlminer/help/HClst/HClst_intro.htm
[2] http://www.cs.umd.edu/hcil/multi-cluster/index.html

11.v Laplacian Eigenfunctions computations

Short Description

Computes the Laplacian Eigenfunctions of the data set.

Detailed Description

Computes the Laplacian Eigenfunctions of the data set, by viewing the dataset as a weighted graph and computing the eigenfunctions of a version of the Laplace operator restricted to the dataset. Various types of eigenfunctions can be computed, according to the normalization type for the operator restricted to the dataset and other parameters for its computation (e.g. number of neighbors to consider, width of the kernel, precision, etc.). This is usually very useful for nonlinear dimensionality reduction and for discovering interesting parametrizations and low-dimensional embeddings of the data set. Clustering algorithms are expected to perform better after the map derived from the Laplace eigenfunctions is applied to the data, since clusters are in general enhanced.

Parameters

Label: EigenFunction Type
Description: Type of eigenfunctions: nearest-neighbor or gaussian.
• nn: uses (a symmetrization of) the kernel K(x,y)=1 if x is a k nearest neighbor of y, K(x,y)=0 otherwise.
• gauss: uses (a symmetrization of) the kernel K(x,y)=e^(-(||x-y||/delta)^2) if x is a k nearest neighbor of y, K(x,y)=0 otherwise.

Label: Number of Eigenfunctions to compute
Description: How many eigenfunctions to compute. The maximum number is equal to the cardinality of the dataset; normally far fewer are computed (tens or a few hundreds, independently of the cardinality of the dataset and depending only on some notion of complexity of the data).

Label: Type of Normalization
Description: How the operator on the dataset is normalized. In the following, D(x,x) is the sum of K(x,y) over all y, D(x,z)=0 if z is not x, while W(x,y) is the weight between x and y (1 if <EigenFunction Type> is 'nn', a weighted exponential if <EigenFunction Type> is 'gauss').
• Laplace-Beltrami: the operator is normalized as in graph theory, but the eigenfunctions are rescaled and are the same as those of the normalization D^{-1}*(D-W). The advantage of this normalization is that the problem is symmetric.
• D^{-1/2}*(D-W)*D^{-1/2}: the operator is normalized as in graph theory. The problem is symmetric, but the operator is not an averaging operator, and it does not coincide in general with the Laplace-Beltrami operator on a manifold.
• D^{-1}*(D-W): the operator is averaging on the set, but the problem is not symmetric. Its eigenfunctions are the same as for "Laplace-Beltrami", but in that case they are computed via a symmetric problem.
• D-W: the operator is symmetric but not averaging; the density of points is not normalized out.
• W: no normalization.

Label: Which Eigenvalues
Description: Which eigenvalues should be computed: small or large.
• small: compute the small eigenvalues (they start from 0 up).
• large: compute the large eigenvalues.

Label: Number of nn
Description: Number of nearest neighbors to consider in the computation of the eigenfunctions. Should be large enough, especially if <EigenFunction Type> is 'gauss', but not too large, especially if <EigenFunction Type> is 'nn'; otherwise the graph considered is complete.

Label: Delta
Description: Scale of the operator. This has no effect if <EigenFunction Type> is 'nn', while it affects the width of the gaussian kernel when <EigenFunction Type> is 'gauss'. The gaussian kernel is written in the form e^(-(||x-y||/delta)^2).

Label: Precision
Description: Precision of the computations when <EigenFunction Type> is 'gauss'.

Label: Energy Threshold
Description: Keep only enough eigenvalues to cover the given fraction of total energy. The total energy is estimated using only the number of eigenfunctions set by the parameter "Number of Eigenfunctions to compute".

Label: Maximum number of Eigenvalues
Description: Maximum number of eigenvalues to compute.

Inputs

Name        Description                                                Multiplicity
Vector Set  The set of points on which to compute the eigenfunctions.  1

Outputs

Name             Description                                                Multiplicity
EigenMap         The EigenMap associated with the computed Eigenfunctions.  1
EigenMap Images  The image under the EigenMap of the input <Vector Set>.    1
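The relation between the symmetric normalization and D^{-1}*(D-W) can be made concrete with a small dense sketch: if v is an eigenvector of D^{-1/2}*(D-W)*D^{-1/2}, then D^{-1/2}*v is an eigenvector of D^{-1}*(D-W) with the same eigenvalue. Everything below is an illustration under assumptions, not the HSE implementation: the symmetrization is taken to be the element-wise maximum of K and its transpose, the gaussian kernel is written as e^(-(||x-y||/delta)^2), all pairwise distances are computed densely (so this only suits small point sets), and the function and argument names are hypothetical.

```python
import numpy as np

def laplacian_eigenfunctions(points, n_eig=3, n_nn=5, delta=1.0, kind="gauss"):
    """Smallest eigenpairs of D^{-1}(D - W), computed via the symmetric problem."""
    x = np.asarray(points, dtype=float)
    n = x.shape[0]
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2))
    # n_nn nearest neighbours of each point; column 0 of the sort is the point itself.
    order = np.argsort(dist, axis=1)[:, 1:n_nn + 1]
    rows = np.repeat(np.arange(n), order.shape[1])
    cols = order.ravel()
    k = np.zeros((n, n))
    if kind == "nn":
        k[rows, cols] = 1.0
    else:
        k[rows, cols] = np.exp(-(dist[rows, cols] / delta) ** 2)
    w = np.maximum(k, k.T)                 # assumed symmetrization of the kernel
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    # Symmetric problem: D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(n) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)       # eigenvalues in ascending order
    # Rescale to obtain eigenvectors of the averaging operator D^{-1}(D - W).
    funcs = d_inv_sqrt[:, None] * vecs[:, :n_eig]
    return vals[:n_eig], funcs
```

The columns of `funcs` can be used as low-dimensional coordinates for the points; for data made of well separated groups, the eigenfunctions with eigenvalue near 0 are nearly constant on each group, which is why clusters tend to be enhanced.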

References

[1] S. Lafon, R.R. Coifman, Geometric Harmonics, Tech Report, CS Dept., Yale University, 2003
[2] S. Lafon, R.R. Coifman, Diffusion Maps and Geometric Harmonics, Tech Report, CS Dept., Yale University, 2004
[3] R.R. Coifman, M. Maggioni, Multiresolution Analysis associated to diffusion semigroups: construction and fast algorithms, Tech Report, CS Dept., Yale University, 2004

11.vi Local Discriminant Bases

Short Description

Finds local features that discriminate well between the classes as labelled in the training set.

Detailed Description

Uses fast Fourier and wavelet algorithms to search a dictionary for feature vectors that discriminate well between the classes in the Labelled Training Set [1]. Offers several choices of libraries of bases, based on windowed local cosines, wavelets and other filter banks.

Parameters

Label: Discrimination Measure
Description: Determines which discrimination measure and cost are associated with the projection of the points onto a feature.
• SRE: symmetric KL distance on the coefficients.
• ARE: asymmetric KL distance on the coefficients.
• SED: squared Euclidean distance on the coefficients.

Label: Features
Description: Determines in which dictionary to search for the feature vectors: options include various wavelet packet dictionaries and sine and cosine libraries.
• Haar: Haar wavelets.
• Beylkin: Beylkin wavelets.
• Coiflet: Coifman wavelets (maximum number of vanishing moments).
• Daubechies: Daubechies compactly supported wavelets.
• Symmlet: Symmlet wavelets, which are 'maximally symmetric'.
• Vaidyanathan: Vaidyanathan filters.

Label: Anisotropic Ldb
Description: Determines whether to use the most flexible Haar-Walsh tiling of the time-frequency plane.

Inputs

Name: Labelled Training Set
Description: The set of vectors with labels to be used as the training set. The vectors are all of the same dimension, while the labels are integer numbers starting from 1.
Multiplicity: 1

Outputs

Name: Feature Vectors
Description: Set of feature vectors. These are vectors in the same space as the vectors in the Labelled Training Set which are the most discriminant (according to the selected criterion) among the vectors in the chosen dictionary.
Multiplicity: 1

References

[1] N. Saito, R. Coifman, F.B. Geshwind, and F. Warner, Discriminant feature extraction using empirical probability density estimation and a local basis library, Pattern Recognition, 2002

11.vii Nearest Neighbor Classifier

Short Description

Basic nearest neighbor classifier.

Detailed Description

The nearest neighbor classifier assigns to a point the label that appears most often among the k nearest neighbors of the point. It is a very basic classifier, but it is asymptotically (in the number of points) close to the Bayesian classifier. It is quite efficient in low dimension, but in general quite unreliable in high dimensions, since it is very adversely affected by noisy distance computations.

Parameters

Label: Estimation Method
Description: Specifies how many nearest neighbors are used to vote for the label of a test point.

Label: Metric
Description: Specifies which metric to use to find the nearest neighbors. Either the standard Euclidean metric in the dimension of the data or the "maximum" metric (the distance between two points being the maximum of the distances of the projections on any coordinate axis) can be specified.
• euclidean: use the Euclidean metric.
• maximum: use the maximum or L^\infty metric.

Inputs

Name: Labelled Training Set
Description: This is the set of labelled training points. To estimate the label of a test point, the nearest points in this training set are determined, and the label most represented among them is chosen as the label of the test point.
Multiplicity: 1

Name: Classify
Description: These are the datacubes to classify.
Multiplicity: 0

Outputs

Name: Classifications
Description: These are the classification datacubes: the i-th slice of the datacube represents the probability that the point belongs to class i.
Multiplicity: Classify

References

[1] http://www.cs.sunysb.edu/~algorith/files/nearest-neighbor.shtml
[2] http://www.physik3.gwdg.de/tstool/index.html

11.viii Nearest Neighbor Projection

Short Description

Maps to the set of distances from training classes.

Detailed Description

Given a training set consisting of k training classes, and a set of test vectors, this algorithm computes the map that sends each test vector to the vector of its distances from each of the k training classes. Various options for which distance to use are given, including nearest point distance, farthest point distance, average distance, distance from an approximate linear subspace and so on. The choice of distance depends on the application, the data set and the objective, and in general it greatly affects the result of the algorithm.

Parameters

Label: Point to set metric
Description: Specifies the metric to use when measuring distances between a point x and
a set A.

•  min: d(x,A) is the distance (measured according to <Metric>) between
x and the closest point to x in A.

•  max: d(x,A) is the distance (measured according to <Metric>)
between x and the farthest point from x in A.

•  mean: d(x,A) is the average distance (measured according to
<Metric>) between x and the points in A.

•  center: d(x,A) is the distance (measured according to <Metric>)
between x and the center of A.

•  subspace: d(x,A) is the norm of the component of x orthogonal to the
subspace spanned by A. Because of noise, this subspace is defined by A
together with a tolerance: it is the one spanned by the top principal
components of the datapoints in A, truncated according to the tolerance.

Label: Estimation method
Description: Specifies how many nearest neighbors are used to vote for the label of a
test point.

Label: Metric
Description: Specifies which metric to use to find the nearest neighbors. Either the
standard Euclidean metric in the dimension of the data or the "maximum"
metric (the distance between two points being the maximum of the
distances of the projections on any coordinate axis) can be specified.

•  euclidean: Standard Euclidean metric.

•  maximum: Maximum or L^\infty metric: the distance between two
points is given by the largest of the absolute values of the
differences of the coordinates of the two points.
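
The two metrics and the first four point-to-set options can be illustrated with a short Python sketch. This is not the Hyperspectral Explorer implementation: the function names and the `mode` and `metric` arguments are hypothetical, the "center" of A is assumed here to be its coordinate-wise mean, and the subspace option is left out because it needs a principal component computation.

```python
import math


def euclidean(p, q):
    """Standard Euclidean metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def maximum(p, q):
    """Maximum (L-infinity) metric: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))


def point_to_set(x, A, mode="min", metric=euclidean):
    """Distance d(x, A) between a point x and a non-empty list of points A."""
    if mode == "center":
        # Assumption: the center of A is its coordinate-wise mean.
        center = [sum(col) / len(A) for col in zip(*A)]
        return metric(x, center)
    dists = [metric(x, a) for a in A]
    if mode == "min":
        return min(dists)
    if mode == "max":
        return max(dists)
    if mode == "mean":
        return sum(dists) / len(dists)
    raise ValueError(f"unsupported mode: {mode}")
```

For x = (0, 0) and A = [(3, 4), (6, 8)], the Euclidean min, max and mean distances are 5, 10 and 7.5, which shows how strongly the choice of option can change the value fed to the classifier.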

Inputs

Name: Labelled Training Set
Description: This is the set of labelled training points. To compute the
NN-projection of a test point, the nearest points in this
training set are determined, and the i-th coordinate of the
result point is the distance from the closest training point in
class i.
Multiplicity: 1

Name: NN-project
Description: These are the datacubes to be NN-projected.
Multiplicity: 0
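
As a minimal sketch of the NN-projection described for the training set input, the following Python function maps a test point to one coordinate per class, each being the distance to the closest training point of that class. The function name, the argument layout and the choice to order classes by sorted label are assumptions made for illustration; the Euclidean metric is used by default.

```python
import math


def nn_project(x, training, labels, metric=math.dist):
    """Return one distance per class: entry i is the distance from x to the
    closest training point whose label is the i-th smallest label."""
    classes = sorted(set(labels))  # assumption: coordinates follow sorted labels
    return [
        min(metric(x, p) for p, lab in zip(training, labels) if lab == c)
        for c in classes
    ]
```

A point that lies close to class 1 and far from class 2 is therefore mapped to a vector with a small first coordinate and a large second coordinate.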

Outputs

Name: NN-Projection Map
Description: This is the nonlinear NN-Projection Map learnt from the
training set. It can be applied to any data (living in a
correctly dimensioned ambient space).
Multiplicity: 1

Name: NN-Projections
Description: These are the NN-projections of the datacubes chosen in
the input "NN-project".
Multiplicity: NN-project

References

Label  Title


11.ix Spectra Normalization

Short  Description 

Normalizes the spectra of a data cube in various ways.

Detailed  Description 


Basic  algorithms  can  be  used  to  normalize  the  spectra  in  several  ways. 

Parameters

Label: Normalization Type
Description: Type of normalization to be applied to all the spectra in the datacube,
one by one.

•  Linear01: Maps the range of each spectrum linearly onto
[0,1].

•  L2: Normalizes each spectrum so that its Euclidean length, or
energy, is equal to 1.

•  Mean + L2: Normalizes each spectrum so that its average is 0
and its Euclidean length, or energy, is equal to 1.
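
The three normalization types can be sketched in a few lines of Python, applied to one spectrum at a time. The function names are hypothetical, and the handling of degenerate spectra (constant or all-zero) is an assumption, since the text does not say what the application does in those cases.

```python
import math


def linear01(spectrum):
    """Map the range of the spectrum linearly onto [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        # Assumption: a constant spectrum has no range to stretch; return zeros.
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]


def l2(spectrum):
    """Scale the spectrum so that its Euclidean length (energy) is 1."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        # Assumption: an all-zero spectrum is returned unchanged.
        return list(spectrum)
    return [v / norm for v in spectrum]


def mean_l2(spectrum):
    """Subtract the average, then scale to unit Euclidean length."""
    mean = sum(spectrum) / len(spectrum)
    return l2([v - mean for v in spectrum])
```

For example, `linear01([2, 4, 6])` gives `[0.0, 0.5, 1.0]`, and `mean_l2` of the same spectrum has average 0 and length 1.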

Inputs

Name: Datacubes
Description: The data cubes whose spectra need to be normalized. The
user can specify one or more data cubes.
Multiplicity: 0

Outputs

Name: Normalized Datacubes
Description: The data cubes resulting from the normalization
process.
Multiplicity: Datacubes

References

Label  Title


11.x Principal Component Analysis

Short  Description 

Computes the principal components of the set of points.

Detailed  Description 

Computes the principal components of the set of points [1,2,3]. These can be defined as
the axes of maximum variance of the set of points. The algorithm is randomized in order
to allow the computation for large datasets: a random subset is selected and its principal
components are computed and returned. The size of the random subset is specified as a
parameter by the user. The user also specifies the number of principal vectors to be
computed. Because of randomization, if the random subset is small, one can in general
expect a good relative estimate only for the top few eigenvectors, and not for the
remaining ones.
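
The idea of estimating principal components from a random subset can be sketched as follows. This is an illustration only: it uses power iteration with deflation on the sample covariance matrix, which is a swapped-in technique chosen to keep the example dependency-free and is not necessarily what the application does. The function name, the fixed iteration count and the `seed` argument are assumptions, and the number of components is given as a count rather than a percentage for simplicity.

```python
import math
import random


def top_principal_components(points, n_components, subset_percent, seed=0):
    """Estimate leading principal components from a random subset of points."""
    rng = random.Random(seed)
    m = max(2, round(len(points) * subset_percent / 100))
    sample = rng.sample(points, min(m, len(points)))
    dim = len(sample[0])

    # Center the sample and build its covariance matrix.
    mean = [sum(p[j] for p in sample) / len(sample) for j in range(dim)]
    centered = [[p[j] - mean[j] for j in range(dim)] for p in sample]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (len(sample) - 1) for j in range(dim)]
        for i in range(dim)
    ]

    def apply(matrix, v):
        return [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]

    components = []
    for _ in range(n_components):
        # Random unit start vector, then repeated multiplication by cov.
        v = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(500):
            w = apply(cov, v)
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [c / norm for c in w]
        eigval = sum(a * b for a, b in zip(v, apply(cov, v)))
        components.append(v)
        # Deflation: remove the direction just found before the next pass.
        for i in range(dim):
            for j in range(dim):
                cov[i][j] -= eigval * v[i] * v[j]
    return components
```

With a small subset the leading direction is usually recovered well, while later components inherit both the sampling error and the error of the earlier ones, which matches the caveat above.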

Parameters

Label: Number of PC's
Description: Number of principal components to compute and return, expressed as a
percentage of the total number of principal components (which is the
same as the dimensionality of the space).

Label: Size of random subset
Description: Specifies the size of the random subset of points actually used for the
computation, expressed as a percentage of the total number of samples
available.

Inputs

Name: Datacubes
Description: The datacubes whose principal components need to be
computed, one datacube at a time.
Multiplicity: 0

Outputs

Name: SVD Basis
Description: The principal components of the data set.
Multiplicity: Datacubes

References

Label  Title

[1]  Golub, Matrix Computations

[2]  Anderson, E., Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du
Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen,
LAPACK User's Guide

[3]  Numerical Recipes in C, www.nr.com


12 File Format Description

Hyperspectral  Explorer  can  load  files  of  the  following  types: 

-  Matlab  files,  properly  formatted  as  specified  below. 

-  HSE files: these are header files that are created when saving files from HyperSpectral
Explorer in its own format, and are linked to the Matlab files that actually contain the data.

-  PND files: these are files saved from the NSTIS application by Plain Sight. Seamless
integration with NSTIS allows files to be transferred automatically from the NSTIS
application to HyperSpectral Explorer, via the File->Export command in the NSTIS
application.

-  ENVI  files:  these  are  a  widespread  format  for  hyperspectral  images. 


12.i  Matlab  files 

Hyperspectral  Explorer  loads  files  saved  from  Matlab,  with  extension  .mat,  and  satisfying 
the  characteristics  described  below. 

Let  the  filename  be  <Filename.mat>. 

If  a  datacube  is  stored  in  the  file  <Filename.mat>,  then  this  file  should  contain  a  Matlab 
variable called <Filename>. This variable should be a structure, containing a field named
<Data>. This field contains the datacube values, as a 3 dimensional array, whose
coordinates are X,Y,S, i.e. the two spatial coordinates first, and the spectral coordinate
third.
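
To make the X,Y,S ordering concrete, here is a small Python sketch that models the <Data> array as a nested list, purely for illustration. Hyperspectral Explorer itself reads Matlab structures; the helper names below are hypothetical, and indices are zero-based here whereas Matlab indices start at 1.

```python
def cube_shape(cube):
    """Return (X, Y, S) for a rectangular nested-list datacube."""
    nx, ny, ns = len(cube), len(cube[0]), len(cube[0][0])
    for plane in cube:
        if len(plane) != ny or any(len(spectrum) != ns for spectrum in plane):
            raise ValueError("datacube must be a rectangular X-by-Y-by-S array")
    return nx, ny, ns


def spectrum_at(cube, x, y):
    """All S spectral values at spatial position (x, y)."""
    return list(cube[x][y])
```

Fixing the first two indices selects a pixel, and the remaining third axis runs over the spectral bands of that pixel.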

If  a  training  set  is  stored  in  the  file  <Filename.mat>,  then  this  file  should  contain  a  Matlab 
variable  called  <Filename>.  This  variable  should  be  a  structure,  containing  a  field  named 
<Data>  and  a  field  named  <Labels>.  The  field  named  <Data>  should  be  a  2  dimensional 
array  N  by  S,  each  row  representing  a  spectrum  in  the  training  set.  The  field  named 
<Labels>  should  be  a  1  dimensional  column  vector  of  length  N,  the  i-th  entry 
representing  the  label  (an  integer  number  greater  than  0)  of  the  i-th  spectrum  in  the  field 
<Data>. 


12.ii  ENVI  files 

ENVI  is  a  widespread  file  format  for  hyperspectral  images.  There  are  two  ENVI  files  for 
each  hyperspectral  cube:  a  header  file  and  a  data  file.  The  header  file  contains  a 
description of the data, and other useful information (e.g. it may contain information
about the frequency range, the date of collection, etc.). The data file contains the
hyperspectral cube.

Hyperspectral Explorer will import ENVI files that have both a header file and the
corresponding data file. The header should have extension .hdr, and the data file should
have extension .img. When a file is loaded, a dialog with the header information of the
file is displayed.

The file is loaded into a Matlab structure that contains both the data and the description
from the header file, which is thus accessible at any time by accessing the variable from the
Matlab command window Hyperspectral Explorer is connected to.
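
The pairing rule for ENVI files (a .hdr header next to a .img data file with the same base name) can be checked before attempting an import. The sketch below is an assumption about how such a check might look and is not part of Hyperspectral Explorer; the function names are hypothetical.

```python
from pathlib import Path


def envi_pair(path):
    """Given either file of an ENVI pair, return (header_path, data_path)."""
    base = Path(path)
    if base.suffix.lower() not in (".hdr", ".img"):
        raise ValueError("expected a .hdr or .img file")
    return base.with_suffix(".hdr"), base.with_suffix(".img")


def is_loadable(path):
    """True only if both the header and the data file exist on disk."""
    header, data = envi_pair(path)
    return header.is_file() and data.is_file()
```

Calling `is_loadable` on either file of the pair gives the same answer, so a missing header or data file is caught regardless of which one the user selects.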


12.iii  HSE  files 

These  files  are  created  when  saving  variables  from  the  Hyperspectral  Explorer 
workspace.  These  files  are  header  files,  and  contain  pointers  (file  names  with  path 
information)  to  Matlab  files,  as  described  above,  that  actually  contain  the  data.  If  the 
Matlab  files  pointed  to  from  inside  the  HSE  file  are  moved  or  deleted  or  modified,  the 
HSE  file  will  become  unusable. 


12.iv  PND  files 

These files are the default output when saving hyperspectral data from the NSTIS
application by Plain Sight.


13 Error Description Table


ERROR: Error E0001: could not connect to Matlab wrapper server.
CAUSE: The Matlab wrapper server, in the DLL file MatlabEngineWrapper.dll, is not
correctly registered.
SOLUTIONS: Reinstall Hyperspectral Explorer.
For the more advanced user: run regsvr32 <dir>/MatlabEngineWrapper.dll to re-register
the Matlab wrapper server.

ERROR: Error E0002: could not connect to Matlab server (error code: -3)
CAUSE: The Matlab wrapper server failed to connect to the Matlab server. Either Matlab
is not properly installed, or the Matlab license server is not connected, or the user does
not have enough privileges to connect to Matlab.
SOLUTIONS: Reinstall Matlab and Hyperspectral Explorer.
For the more advanced user: re-register the Matlab server manually. To do so, either read
the instructions in the Matlab help, or simply run:

    Matlab /regserver

from the Matlab binary directory to re-register the Matlab server.

ERROR: Error E0003: failed to add the datagroup <name> to the datagroup server.
CAUSE: One of the data groups could not be created. The application definition file has
been modified incorrectly.
SOLUTIONS: Restore the original application configuration XML file AppDef.xml, or
re-install HyperSpectral Explorer to restore it.

ERROR: Error E0004: some resources could not be freed. You may have to shut down
Matlab manually.
CAUSE: While disconnecting from Matlab and closing the application, HyperSpectral
Explorer could not free all the objects allocated by Matlab, and/or could not disconnect
from the Matlab engine.
SOLUTIONS: If a “Matlab Command Window” is open, or present in the taskbar, close it
manually. If Matlab does not close, go to the Task Manager (right-click on the taskbar,
and select Task Manager), go to the “Processes” tab, and terminate the Matlab process.

ERROR: Error E0005: failed to load algorithm (algorithm number xx).
CAUSE: At startup HyperSpectral Explorer connects to the various algorithms available.
One such algorithm was not available, or its configuration file was not available or was
not correctly formatted.
SOLUTIONS: Restore the application definition file AppDef.xml and all the algorithm
definition files in the Algorithms directory. If they are not available, reinstall
HyperSpectral Explorer.

ERROR: E0006: failed to load the datagroups.
CAUSE: The application definition file AppDef.xml is not properly formatted.
SOLUTIONS: Restore the original application configuration XML file AppDef.xml, or
re-install HyperSpectral Explorer to restore it.